Overview

Dataset statistics

Number of variables: 53
Number of observations: 412698
Missing cells: 11917731
Missing cells (%): 54.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 166.9 MiB
Average record size in memory: 424.0 B
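The two derived figures in this block can be cross-checked from the raw counts. The sketch below is plain Python with values copied from the block above. The formulas are assumptions, since the report does not state them: missing-cell share is taken as missing cells over rows × columns, and average record size as total memory over the number of observations.

```python
# Values copied from the "Dataset statistics" block.
n_variables = 53
n_observations = 412_698
missing_cells = 11_917_731
total_size_mib = 166.9  # rounded in the report, so results are approximate

# Assumed formula: share of all cells (rows x columns) that are missing.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumed formula: total memory in bytes divided evenly over the rows.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the total size is rounded to one decimal, the recomputed record size only agrees with the reported 424.0 B to within a fraction of a byte.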

Variable types

Numeric: 8
Categorical: 45

Alerts

saledate has a high cardinality: 4013 distinct values [High cardinality]
fiModelDesc has a high cardinality: 5059 distinct values [High cardinality]
fiBaseModel has a high cardinality: 1961 distinct values [High cardinality]
fiSecondaryDesc has a high cardinality: 177 distinct values [High cardinality]
fiModelSeries has a high cardinality: 123 distinct values [High cardinality]
fiModelDescriptor has a high cardinality: 140 distinct values [High cardinality]
fiProductClassDesc has a high cardinality: 74 distinct values [High cardinality]
state has a high cardinality: 53 distinct values [High cardinality]
Forks is highly imbalanced (61.9%) [Imbalance]
Pad_Type is highly imbalanced (70.5%) [Imbalance]
Transmission is highly imbalanced (60.5%) [Imbalance]
Turbocharged is highly imbalanced (71.7%) [Imbalance]
Blade_Extension is highly imbalanced (84.6%) [Imbalance]
Enclosure_Type is highly imbalanced (57.2%) [Imbalance]
Engine_Horsepower is highly imbalanced (70.7%) [Imbalance]
Coupler is highly imbalanced (57.3%) [Imbalance]
Coupler_System is highly imbalanced (62.6%) [Imbalance]
Grouser_Tracks is highly imbalanced (64.1%) [Imbalance]
Hydraulics_Flow is highly imbalanced (93.1%) [Imbalance]
Undercarriage_Pad_Width is highly imbalanced (68.6%) [Imbalance]
Stick_Length is highly imbalanced (70.9%) [Imbalance]
Pattern_Changer is highly imbalanced (71.8%) [Imbalance]
Grouser_Type is highly imbalanced (61.7%) [Imbalance]
Backhoe_Mounting is highly imbalanced (99.7%) [Imbalance]
Travel_Controls is highly imbalanced (72.0%) [Imbalance]
Differential_Type is highly imbalanced (92.5%) [Imbalance]
Steering_Controls is highly imbalanced (96.0%) [Imbalance]
auctioneerID has 20136 (4.9%) missing values [Missing]
MachineHoursCurrentMeter has 265194 (64.3%) missing values [Missing]
UsageBand has 339028 (82.1%) missing values [Missing]
fiSecondaryDesc has 140727 (34.1%) missing values [Missing]
fiModelSeries has 354031 (85.8%) missing values [Missing]
fiModelDescriptor has 337882 (81.9%) missing values [Missing]
ProductSize has 216605 (52.5%) missing values [Missing]
Drive_System has 305611 (74.1%) missing values [Missing]
Forks has 214983 (52.1%) missing values [Missing]
Pad_Type has 331602 (80.3%) missing values [Missing]
Ride_Control has 259970 (63.0%) missing values [Missing]
Stick has 331602 (80.3%) missing values [Missing]
Transmission has 224691 (54.4%) missing values [Missing]
Turbocharged has 331602 (80.3%) missing values [Missing]
Blade_Extension has 386715 (93.7%) missing values [Missing]
Blade_Width has 386715 (93.7%) missing values [Missing]
Enclosure_Type has 386715 (93.7%) missing values [Missing]
Engine_Horsepower has 386715 (93.7%) missing values [Missing]
Hydraulics has 82565 (20.0%) missing values [Missing]
Pushblock has 386715 (93.7%) missing values [Missing]
Ripper has 305753 (74.1%) missing values [Missing]
Scarifier has 386704 (93.7%) missing values [Missing]
Tip_Control has 386715 (93.7%) missing values [Missing]
Tire_Size has 315060 (76.3%) missing values [Missing]
Coupler has 192019 (46.5%) missing values [Missing]
Coupler_System has 367724 (89.1%) missing values [Missing]
Grouser_Tracks has 367823 (89.1%) missing values [Missing]
Hydraulics_Flow has 367823 (89.1%) missing values [Missing]
Track_Type has 310505 (75.2%) missing values [Missing]
Undercarriage_Pad_Width has 309782 (75.1%) missing values [Missing]
Stick_Length has 310437 (75.2%) missing values [Missing]
Thumb has 310366 (75.2%) missing values [Missing]
Pattern_Changer has 310437 (75.2%) missing values [Missing]
Grouser_Type has 310505 (75.2%) missing values [Missing]
Backhoe_Mounting has 331986 (80.4%) missing values [Missing]
Blade_Type has 330823 (80.2%) missing values [Missing]
Travel_Controls has 330821 (80.2%) missing values [Missing]
Differential_Type has 341134 (82.7%) missing values [Missing]
Steering_Controls has 341176 (82.7%) missing values [Missing]
MachineHoursCurrentMeter is highly skewed (γ1 = 37.17158794) [Skewed]
SalesID has unique values [Unique]
MachineHoursCurrentMeter has 73834 (17.9%) zeros [Zeros]
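The report does not say how the imbalance percentage in the alerts is computed. One plausible reading, offered here only as an assumption, is one minus the Shannon entropy of the class shares normalised by the maximum entropy for that number of classes. The sketch below implements that reading in plain Python. The function name `imbalance_score` is hypothetical, and the UsageBand counts are taken from later in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy of the class shares.

    Assumed definition (not confirmed by the report): 0 means perfectly
    balanced classes, values near 1 mean one class dominates.
    """
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class has zero entropy; treat it as balanced
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# UsageBand value counts (Medium, Low, High) from this report.
usage_band = [35832, 25311, 12527]
score = imbalance_score(usage_band)
print(f"UsageBand imbalance under the assumed formula: {score:.1%}")
```

Under this assumed formula UsageBand scores low, which is at least consistent with it not appearing among the imbalance alerts above.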

Reproduction

Analysis started: 2023-02-19 13:19:49.354420
Analysis finished: 2023-02-19 13:21:26.620422
Duration: 1 minute and 37.27 seconds
Software version: pandas-profiling v3.6.2
Download configuration: config.json
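The reported duration follows directly from the two timestamps. A minimal check in plain Python, with the timestamp format string inferred from how the values are printed above:

```python
from datetime import datetime

# Format inferred from the printed timestamps (microsecond precision).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-02-19 13:19:49.354420", FMT)
finished = datetime.strptime("2023-02-19 13:21:26.620422", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```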

Variables

SalesID
Real number (ℝ)

Distinct: 412698
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2011161.2
Minimum: 1139246
Maximum: 6333349
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 MiB

Quantile statistics

Minimum: 1139246
5-th percentile: 1210925.9
Q1: 1421897.8
median: 1645852.5
Q3: 2261012.5
95-th percentile: 4357166.8
Maximum: 6333349
Range: 5194103
Interquartile range (IQR): 839114.75

Descriptive statistics

Standard deviation: 1080067.7
Coefficient of variation (CV): 0.53703688
Kurtosis: 7.413359
Mean: 2011161.2
Median Absolute Deviation (MAD): 274217
Skewness: 2.7207635
Sum: 8.3000219 × 10^11
Variance: 1.1665463 × 10^12
Monotonicity: Not monotonic
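Several of the SalesID figures are simple functions of others in the same block. The sketch below recomputes them in plain Python from the reported values, using the textbook definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared). The inputs are rounded in the report, so agreement is only expected to the printed precision.

```python
# Values copied from the SalesID statistics.
minimum, maximum = 1139246, 6333349
q1, q3 = 1421897.8, 2261012.5
mean, std = 2011161.2, 1080067.7

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(f"Range:    {value_range}")
print(f"IQR:      {iqr:.2f}")
print(f"CV:       {cv:.8f}")
print(f"Variance: {variance:.7e}")
```

The same relationships hold for the other numeric variables in this report and give a quick way to spot a mis-transcribed value.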
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1139246 1
 
< 0.1%
1822460 1
 
< 0.1%
1822471 1
 
< 0.1%
1822470 1
 
< 0.1%
1822469 1
 
< 0.1%
1822468 1
 
< 0.1%
1822466 1
 
< 0.1%
1822465 1
 
< 0.1%
1822464 1
 
< 0.1%
1822463 1
 
< 0.1%
Other values (412688) 412688
> 99.9%
ValueCountFrequency (%)
1139246 1
< 0.1%
1139248 1
< 0.1%
1139249 1
< 0.1%
1139251 1
< 0.1%
1139253 1
< 0.1%
1139255 1
< 0.1%
1139256 1
< 0.1%
1139261 1
< 0.1%
1139272 1
< 0.1%
1139275 1
< 0.1%
ValueCountFrequency (%)
6333349 1
< 0.1%
6333348 1
< 0.1%
6333347 1
< 0.1%
6333345 1
< 0.1%
6333344 1
< 0.1%
6333343 1
< 0.1%
6333342 1
< 0.1%
6333341 1
< 0.1%
6333339 1
< 0.1%
6333338 1
< 0.1%

SalePrice
Real number (ℝ)

Distinct: 954
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31215.181
Minimum: 4750
Maximum: 142000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 MiB

Quantile statistics

Minimum: 4750
5-th percentile: 8500
Q1: 14500
median: 24000
Q3: 40000
95-th percentile: 81000
Maximum: 142000
Range: 137250
Interquartile range (IQR): 25500

Descriptive statistics

Standard deviation: 23141.744
Coefficient of variation (CV): 0.74136182
Kurtosis: 2.1588227
Mean: 31215.181
Median Absolute Deviation (MAD): 11250
Skewness: 1.5177396
Sum: 1.2882443 × 10^10
Variance: 5.355403 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
25000 7892
 
1.9%
20000 7678
 
1.9%
15000 7568
 
1.8%
26000 7176
 
1.7%
16000 7147
 
1.7%
14000 7142
 
1.7%
17000 6976
 
1.7%
10000 6916
 
1.7%
11000 6905
 
1.7%
13000 6864
 
1.7%
Other values (944) 340434
82.5%
ValueCountFrequency (%)
4750 197
 
< 0.1%
4800 19
 
< 0.1%
4850 7
 
< 0.1%
4900 26
 
< 0.1%
4935 1
 
< 0.1%
4950 3
 
< 0.1%
4987 2
 
< 0.1%
5000 844
0.2%
5100 52
 
< 0.1%
5150 1
 
< 0.1%
ValueCountFrequency (%)
142000 5
 
< 0.1%
141000 13
 
< 0.1%
140000 104
< 0.1%
139000 6
 
< 0.1%
138900 1
 
< 0.1%
138000 11
 
< 0.1%
137500 53
< 0.1%
137000 14
 
< 0.1%
136000 18
 
< 0.1%
135000 98
< 0.1%

MachineID
Real number (ℝ)

Distinct: 348808
Distinct (%): 84.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1230061.4
Minimum: 0
Maximum: 2486330
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 217982.85
Q1: 1088593.2
median: 1284397
Q3: 1478079.2
95-th percentile: 1864816.3
Maximum: 2486330
Range: 2486330
Interquartile range (IQR): 389486

Descriptive statistics

Standard deviation: 453953.26
Coefficient of variation (CV): 0.36904926
Kurtosis: 0.90911912
Mean: 1230061.4
Median Absolute Deviation (MAD): 194719
Skewness: -0.63454189
Sum: 5.0764389 × 10^11
Variance: 2.0607356 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2283592 48
 
< 0.1%
2285830 44
 
< 0.1%
1896854 40
 
< 0.1%
1746392 34
 
< 0.1%
2268800 31
 
< 0.1%
2282547 29
 
< 0.1%
1942724 29
 
< 0.1%
2300370 27
 
< 0.1%
2208545 27
 
< 0.1%
2296335 26
 
< 0.1%
Other values (348798) 412363
99.9%
ValueCountFrequency (%)
0 2
< 0.1%
2 1
< 0.1%
13 1
< 0.1%
17 1
< 0.1%
52 1
< 0.1%
63 1
< 0.1%
66 1
< 0.1%
102 2
< 0.1%
113 1
< 0.1%
116 1
< 0.1%
ValueCountFrequency (%)
2486330 1
< 0.1%
2486276 1
< 0.1%
2486275 1
< 0.1%
2486274 1
< 0.1%
2486273 1
< 0.1%
2486111 1
< 0.1%
2486110 1
< 0.1%
2485633 1
< 0.1%
2485319 1
< 0.1%
2485252 1
< 0.1%

ModelID
Real number (ℝ)

Distinct: 5281
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6947.2018
Minimum: 28
Maximum: 37198
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 MiB

Quantile statistics

Minimum: 28
5-th percentile: 590
Q1: 3261
median: 4605
Q3: 8899
95-th percentile: 22114
Maximum: 37198
Range: 37170
Interquartile range (IQR): 5638

Descriptive statistics

Standard deviation: 6280.825
Coefficient of variation (CV): 0.90407982
Kurtosis: 3.04093
Mean: 6947.2018
Median Absolute Deviation (MAD): 2440
Skewness: 1.7466555
Sum: 2.8670963 × 10^9
Variance: 39448762
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
4605 5348
 
1.3%
3538 4976
 
1.2%
3170 4364
 
1.1%
4604 4296
 
1.0%
3362 4186
 
1.0%
3537 3748
 
0.9%
3171 3481
 
0.8%
4603 3443
 
0.8%
3357 3270
 
0.8%
3178 3173
 
0.8%
Other values (5271) 372413
90.2%
ValueCountFrequency (%)
28 38
 
< 0.1%
29 17
 
< 0.1%
31 12
 
< 0.1%
34 9
 
< 0.1%
43 706
0.2%
47 245
 
0.1%
50 10
 
< 0.1%
53 57
 
< 0.1%
55 3
 
< 0.1%
75 441
0.1%
ValueCountFrequency (%)
37198 2
 
< 0.1%
37197 20
< 0.1%
37196 22
< 0.1%
36933 2
 
< 0.1%
36932 1
 
< 0.1%
36928 2
 
< 0.1%
36914 2
 
< 0.1%
36894 1
 
< 0.1%
36885 1
 
< 0.1%
36883 1
 
< 0.1%

datasource
Real number (ℝ)

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 135.16936
Minimum: 121
Maximum: 173
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 MiB

Quantile statistics

Minimum: 121
5-th percentile: 121
Q1: 132
median: 132
Q3: 136
95-th percentile: 149
Maximum: 173
Range: 52
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 9.6467486
Coefficient of variation (CV): 0.071367864
Kurtosis: 6.856407
Mean: 135.16936
Median Absolute Deviation (MAD): 0
Skewness: 2.4381249
Sum: 55784125
Variance: 93.059759
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
132 260776
63.2%
136 75491
 
18.3%
149 33325
 
8.1%
121 25191
 
6.1%
172 17914
 
4.3%
173 1
 
< 0.1%
ValueCountFrequency (%)
121 25191
 
6.1%
132 260776
63.2%
136 75491
 
18.3%
149 33325
 
8.1%
172 17914
 
4.3%
173 1
 
< 0.1%
ValueCountFrequency (%)
173 1
 
< 0.1%
172 17914
 
4.3%
149 33325
 
8.1%
136 75491
 
18.3%
132 260776
63.2%
121 25191
 
6.1%

auctioneerID
Real number (ℝ)

Distinct: 30
Distinct (%): < 0.1%
Missing: 20136
Missing (%): 4.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5852681
Minimum: 0
Maximum: 99
Zeros: 536
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 4
95-th percentile: 22
Maximum: 99
Range: 99
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 17.158409
Coefficient of variation (CV): 2.6055748
Kurtosis: 22.855449
Mean: 6.5852681
Median Absolute Deviation (MAD): 1
Skewness: 4.8088279
Sum: 2585126
Variance: 294.41098
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
1 192773
46.7%
2 57441
 
13.9%
3 30288
 
7.3%
4 20877
 
5.1%
99 12042
 
2.9%
6 11950
 
2.9%
7 7847
 
1.9%
8 7419
 
1.8%
5 7002
 
1.7%
10 5876
 
1.4%
Other values (20) 39047
 
9.5%
(Missing) 20136
 
4.9%
ValueCountFrequency (%)
0 536
 
0.1%
1 192773
46.7%
2 57441
 
13.9%
3 30288
 
7.3%
4 20877
 
5.1%
5 7002
 
1.7%
6 11950
 
2.9%
7 7847
 
1.9%
8 7419
 
1.8%
9 4764
 
1.2%
ValueCountFrequency (%)
99 12042
2.9%
28 860
 
0.2%
27 1150
 
0.3%
26 796
 
0.2%
25 959
 
0.2%
24 1357
 
0.3%
23 1322
 
0.3%
22 1429
 
0.3%
21 1601
 
0.4%
20 2238
 
0.5%

YearMade
Real number (ℝ)

Distinct: 73
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1899.0496
Minimum: 1000
Maximum: 2014
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 MiB

Quantile statistics

Minimum: 1000
5-th percentile: 1000
Q1: 1985
median: 1995
Q3: 2001
95-th percentile: 2005
Maximum: 2014
Range: 1014
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 292.19024
Coefficient of variation (CV): 0.1538613
Kurtosis: 5.5660898
Mean: 1899.0496
Median Absolute Deviation (MAD): 7
Skewness: -2.7485982
Sum: 7.8373399 × 10^8
Variance: 85375.138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1000 39391
 
9.5%
2005 22096
 
5.4%
1998 21751
 
5.3%
2004 20914
 
5.1%
1999 19274
 
4.7%
1997 19269
 
4.7%
2000 17238
 
4.2%
1996 17001
 
4.1%
1995 15806
 
3.8%
2003 14661
 
3.6%
Other values (63) 205297
49.7%
ValueCountFrequency (%)
1000 39391
9.5%
1919 127
 
< 0.1%
1920 17
 
< 0.1%
1937 1
 
< 0.1%
1942 1
 
< 0.1%
1947 1
 
< 0.1%
1948 3
 
< 0.1%
1949 1
 
< 0.1%
1950 8
 
< 0.1%
1951 7
 
< 0.1%
ValueCountFrequency (%)
2014 2
 
< 0.1%
2013 1
 
< 0.1%
2012 1
 
< 0.1%
2011 31
 
< 0.1%
2010 58
 
< 0.1%
2009 212
 
0.1%
2008 1691
 
0.4%
2007 5048
 
1.2%
2006 13426
3.3%
2005 22096
5.4%
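The value 1000 accounts for 39391 rows of YearMade and sits far below the next smallest year (1919), which suggests it is a placeholder rather than a real year. The report itself does not say so, so the sketch below treats that as an assumption. It estimates the mean of the remaining rows in plain Python from the reported (rounded) sum and counts, so the result is approximate.

```python
# Values copied from the YearMade statistics.
n_total = 412_698
n_placeholder = 39_391    # rows where YearMade == 1000
total_sum = 7.8373399e8   # reported Sum, rounded to 8 significant digits

# Share of rows carrying the assumed placeholder value.
placeholder_share = 100 * n_placeholder / n_total

# Mean of the remaining rows after removing the placeholder's contribution.
n_valid = n_total - n_placeholder
mean_valid = (total_sum - 1000 * n_placeholder) / n_valid

print(f"Rows with YearMade == 1000: {placeholder_share:.1f}%")
print(f"Approximate mean of remaining rows: {mean_valid:.1f}")
```

If the assumption holds, the reported mean of 1899.0496 and standard deviation of 292.19024 mostly reflect the placeholder rather than the spread of real manufacturing years.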

MachineHoursCurrentMeter
Real number (ℝ)

MISSING  SKEWED  ZEROS 

Distinct: 15633
Distinct (%): 10.6%
Missing: 265194
Missing (%): 64.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 3522.9883
Minimum: 0
Maximum: 2483300
Zeros: 73834
Zeros (%): 17.9%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 3209
95-th percentile: 10746
Maximum: 2483300
Range: 2483300
Interquartile range (IQR): 3209

Descriptive statistics

Standard deviation: 27169.929
Coefficient of variation (CV): 7.7121825
Kurtosis: 1964.1945
Mean: 3522.9883
Median Absolute Deviation (MAD): 0
Skewness: 37.171588
Sum: 5.1965486 × 10^8
Variance: 7.3820502 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 73834
 
17.9%
2000 124
 
< 0.1%
1000 117
 
< 0.1%
24 115
 
< 0.1%
1500 101
 
< 0.1%
500 97
 
< 0.1%
800 92
 
< 0.1%
1200 88
 
< 0.1%
1400 84
 
< 0.1%
2500 83
 
< 0.1%
Other values (15623) 72769
 
17.6%
(Missing) 265194
64.3%
ValueCountFrequency (%)
0 73834
17.9%
1 2
 
< 0.1%
2 18
 
< 0.1%
3 21
 
< 0.1%
4 35
 
< 0.1%
5 44
 
< 0.1%
6 20
 
< 0.1%
7 12
 
< 0.1%
8 18
 
< 0.1%
9 13
 
< 0.1%
ValueCountFrequency (%)
2483300 1
< 0.1%
2202400 1
< 0.1%
1857100 1
< 0.1%
1729600 1
< 0.1%
1728600 1
< 0.1%
1711700 1
< 0.1%
1602900 1
< 0.1%
1485900 1
< 0.1%
1429800 1
< 0.1%
1282700 1
< 0.1%
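The zero percentage for MachineHoursCurrentMeter is expressed relative to all rows, while the quantiles only make sense over non-missing rows. A short plain-Python check using the counts reported for this variable shows why the median is 0. Which denominator each reported percentage uses is inferred from the arithmetic, not stated in the report.

```python
# Counts copied from the MachineHoursCurrentMeter statistics.
n_total = 412_698
n_missing = 265_194
n_zeros = 73_834
n_distinct = 15_633

n_present = n_total - n_missing

# Zeros as a share of all rows (matches the reported "Zeros (%)").
zero_share_all = 100 * n_zeros / n_total
# Zeros as a share of rows that actually have a value.
zero_share_present = 100 * n_zeros / n_present
# Distinct values as a share of non-missing rows.
distinct_share_present = 100 * n_distinct / n_present

print(f"Non-missing rows: {n_present}")
print(f"Zeros / all rows: {zero_share_all:.1f}%")
print(f"Zeros / non-missing rows: {zero_share_present:.1f}%")
print(f"Distinct / non-missing rows: {distinct_share_present:.1f}%")
```

Just over half of the recorded values are zero, so the 5-th percentile, Q1 and median all come out as 0.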

UsageBand
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 339028
Missing (%): 82.1%
Memory size: 3.1 MiB
Medium 35832
Low 25311
High 12527

Length

Max length: 6
Median length: 4
Mean length: 4.6291978
Min length: 3

Characters and Unicode

Total characters: 341033
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
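The character totals for UsageBand can be reproduced from its three value counts. The sketch below uses Python's standard `unicodedata` module, where category `Lu` is "Uppercase Letter" and `Ll` is "Lowercase Letter". It illustrates the idea only and is not necessarily how pandas-profiling computes these figures internally.

```python
import unicodedata
from collections import Counter

# UsageBand value counts from this report.
value_counts = {"Medium": 35832, "Low": 25311, "High": 12527}

char_counts = Counter()      # occurrences of each character
category_counts = Counter()  # occurrences per Unicode general category

for value, n in value_counts.items():
    for ch in value:
        char_counts[ch] += n
        category_counts[unicodedata.category(ch)] += n

total_chars = sum(char_counts.values())
print(f"Total characters: {total_chars}")
print(f"Distinct characters: {len(char_counts)}")
print(f"Per category: {dict(category_counts)}")
```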

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Low
2nd row: Low
3rd row: High
4th row: High
5th row: Medium

Common Values

ValueCountFrequency (%)
Medium 35832
 
8.7%
Low 25311
 
6.1%
High 12527
 
3.0%
(Missing) 339028
82.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
medium 35832
48.6%
low 25311
34.4%
high 12527
 
17.0%

Most occurring characters

ValueCountFrequency (%)
i 48359
14.2%
M 35832
10.5%
e 35832
10.5%
d 35832
10.5%
u 35832
10.5%
m 35832
10.5%
L 25311
7.4%
o 25311
7.4%
w 25311
7.4%
H 12527
 
3.7%
Other values (2) 25054
7.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 267363
78.4%
Uppercase Letter 73670
 
21.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 48359
18.1%
e 35832
13.4%
d 35832
13.4%
u 35832
13.4%
m 35832
13.4%
o 25311
9.5%
w 25311
9.5%
g 12527
 
4.7%
h 12527
 
4.7%
Uppercase Letter
ValueCountFrequency (%)
M 35832
48.6%
L 25311
34.4%
H 12527
 
17.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 341033
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 48359
14.2%
M 35832
10.5%
e 35832
10.5%
d 35832
10.5%
u 35832
10.5%
m 35832
10.5%
L 25311
7.4%
o 25311
7.4%
w 25311
7.4%
H 12527
 
3.7%
Other values (2) 25054
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 341033
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 48359
14.2%
M 35832
10.5%
e 35832
10.5%
d 35832
10.5%
u 35832
10.5%
m 35832
10.5%
L 25311
7.4%
o 25311
7.4%
w 25311
7.4%
H 12527
 
3.7%
Other values (2) 25054
7.3%

saledate
Categorical

Distinct: 4013
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 MiB
2/16/2009 0:00 1932
2/13/2012 0:00 1598
2/15/2011 0:00 1352
2/19/2008 0:00 1300
2/15/2010 0:00 1219
Other values (4008) 405297

Length

Max length: 15
Median length: 14
Mean length: 13.968107
Min length: 13

Characters and Unicode

Total characters: 5764610
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 197
Unique (%): < 0.1%

Sample

1st row: 11/16/2006 0:00
2nd row: 3/26/2004 0:00
3rd row: 2/26/2004 0:00
4th row: 5/19/2011 0:00
5th row: 7/23/2009 0:00
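saledate is profiled as a categorical string, which is why it triggers a high-cardinality alert. The sample rows look like month/day/year timestamps with a trailing `0:00`, so they can be parsed into real dates. The sketch below uses only the standard library. The format string is an assumption inferred from the five sample rows, and `parse_saledate` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed format: month/day/4-digit year, then hour:minute (no zero padding).
SALEDATE_FORMAT = "%m/%d/%Y %H:%M"


def parse_saledate(text):
    """Parse one saledate string such as '11/16/2006 0:00'."""
    return datetime.strptime(text, SALEDATE_FORMAT)


# The five sample rows shown in this report.
samples = [
    "11/16/2006 0:00",
    "3/26/2004 0:00",
    "2/26/2004 0:00",
    "5/19/2011 0:00",
    "7/23/2009 0:00",
]
parsed = [parse_saledate(s) for s in samples]
print("Earliest sample:", min(parsed).date())
print("Latest sample:", max(parsed).date())
```

Once parsed, the column can be treated as a date and split into year, month and day components instead of 4013 opaque categories.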

Common Values

ValueCountFrequency (%)
2/16/2009 0:00 1932
 
0.5%
2/13/2012 0:00 1598
 
0.4%
2/15/2011 0:00 1352
 
0.3%
2/19/2008 0:00 1300
 
0.3%
2/15/2010 0:00 1219
 
0.3%
2/11/2008 0:00 1100
 
0.3%
3/26/2009 0:00 1050
 
0.3%
2/3/2008 0:00 1009
 
0.2%
2/9/2009 0:00 998
 
0.2%
1/31/2009 0:00 944
 
0.2%
Other values (4003) 400196
97.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0:00 412698
50.0%
2/16/2009 1932
 
0.2%
2/13/2012 1598
 
0.2%
2/15/2011 1352
 
0.2%
2/19/2008 1300
 
0.2%
2/15/2010 1219
 
0.1%
2/11/2008 1100
 
0.1%
3/26/2009 1050
 
0.1%
2/3/2008 1009
 
0.1%
2/9/2009 998
 
0.1%
Other values (4004) 401140
48.6%

Most occurring characters

ValueCountFrequency (%)
0 1933761
33.5%
/ 825396
14.3%
2 629710
 
10.9%
1 551525
 
9.6%
412698
 
7.2%
: 412698
 
7.2%
9 317789
 
5.5%
3 134618
 
2.3%
6 124464
 
2.2%
8 122678
 
2.1%
Other values (3) 299273
 
5.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4113818
71.4%
Other Punctuation 1238094
 
21.5%
Space Separator 412698
 
7.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1933761
47.0%
2 629710
 
15.3%
1 551525
 
13.4%
9 317789
 
7.7%
3 134618
 
3.3%
6 124464
 
3.0%
8 122678
 
3.0%
5 103891
 
2.5%
7 98677
 
2.4%
4 96705
 
2.4%
Other Punctuation
ValueCountFrequency (%)
/ 825396
66.7%
: 412698
33.3%
Space Separator
ValueCountFrequency (%)
412698
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5764610
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1933761
33.5%
/ 825396
14.3%
2 629710
 
10.9%
1 551525
 
9.6%
412698
 
7.2%
: 412698
 
7.2%
9 317789
 
5.5%
3 134618
 
2.3%
6 124464
 
2.2%
8 122678
 
2.1%
Other values (3) 299273
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5764610
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1933761
33.5%
/ 825396
14.3%
2 629710
 
10.9%
1 551525
 
9.6%
412698
 
7.2%
: 412698
 
7.2%
9 317789
 
5.5%
3 134618
 
2.3%
6 124464
 
2.2%
8 122678
 
2.1%
Other values (3) 299273
 
5.2%

fiModelDesc
Categorical

Distinct: 5059
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 MiB
310G 5348
416C 4976
580K 4364
310E 4296
140G 4186
Other values (5054) 389528

Length

Max length: 19
Median length: 17
Mean length: 4.7020994
Min length: 1

Characters and Unicode

Total characters: 1940547
Distinct characters: 58
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 866
Unique (%): 0.2%

Sample

1st row: 521D
2nd row: 950FII
3rd row: 226
4th row: PC120-6E
5th row: S175

Common Values

ValueCountFrequency (%)
310G 5348
 
1.3%
416C 4976
 
1.2%
580K 4364
 
1.1%
310E 4296
 
1.0%
140G 4186
 
1.0%
416B 3765
 
0.9%
580L 3481
 
0.8%
310D 3443
 
0.8%
12G 3270
 
0.8%
580SUPER L 3173
 
0.8%
Other values (5049) 372396
90.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
580super 5924
 
1.4%
310g 5348
 
1.3%
416c 4976
 
1.2%
580k 4364
 
1.0%
310e 4296
 
1.0%
140g 4186
 
1.0%
416b 3765
 
0.9%
l 3596
 
0.9%
580l 3483
 
0.8%
310d 3443
 
0.8%
Other values (5046) 376783
89.7%

Most occurring characters

ValueCountFrequency (%)
0 233804
 
12.0%
5 151223
 
7.8%
1 136164
 
7.0%
2 124327
 
6.4%
3 121688
 
6.3%
4 102206
 
5.3%
6 97655
 
5.0%
C 97550
 
5.0%
L 95509
 
4.9%
D 87192
 
4.5%
Other values (48) 693229
35.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1140913
58.8%
Uppercase Letter 764892
39.4%
Dash Punctuation 24853
 
1.3%
Space Separator 7818
 
0.4%
Other Punctuation 1968
 
0.1%
Lowercase Letter 67
 
< 0.1%
Math Symbol 28
 
< 0.1%
Open Punctuation 4
 
< 0.1%
Close Punctuation 4
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 97550
12.8%
L 95509
12.5%
D 87192
11.4%
G 57872
 
7.6%
E 51887
 
6.8%
I 49538
 
6.5%
B 47360
 
6.2%
P 46443
 
6.1%
S 30337
 
4.0%
X 30168
 
3.9%
Other values (16) 171036
22.4%
Lowercase Letter
ValueCountFrequency (%)
t 11
16.4%
e 9
13.4%
i 8
11.9%
o 6
9.0%
a 6
9.0%
m 4
 
6.0%
h 4
 
6.0%
g 4
 
6.0%
r 3
 
4.5%
s 3
 
4.5%
Other values (5) 9
13.4%
Decimal Number
ValueCountFrequency (%)
0 233804
20.5%
5 151223
13.3%
1 136164
11.9%
2 124327
10.9%
3 121688
10.7%
4 102206
9.0%
6 97655
8.6%
8 73455
 
6.4%
7 51119
 
4.5%
9 49272
 
4.3%
Other Punctuation
ValueCountFrequency (%)
. 1965
99.8%
/ 3
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 24853
100.0%
Space Separator
ValueCountFrequency (%)
7818
100.0%
Math Symbol
ValueCountFrequency (%)
+ 28
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1175588
60.6%
Latin 764959
39.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 97550
12.8%
L 95509
12.5%
D 87192
11.4%
G 57872
 
7.6%
E 51887
 
6.8%
I 49538
 
6.5%
B 47360
 
6.2%
P 46443
 
6.1%
S 30337
 
4.0%
X 30168
 
3.9%
Other values (31) 171103
22.4%
Common
ValueCountFrequency (%)
0 233804
19.9%
5 151223
12.9%
1 136164
11.6%
2 124327
10.6%
3 121688
10.4%
4 102206
8.7%
6 97655
8.3%
8 73455
 
6.2%
7 51119
 
4.3%
9 49272
 
4.2%
Other values (7) 34675
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1940547
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 233804
 
12.0%
5 151223
 
7.8%
1 136164
 
7.0%
2 124327
 
6.4%
3 121688
 
6.3%
4 102206
 
5.3%
6 97655
 
5.0%
C 97550
 
5.0%
L 95509
 
4.9%
D 87192
 
4.5%
Other values (48) 693229
35.7%

fiBaseModel
Categorical

Distinct: 1961
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 MiB
580 20179
310 17886
D6 13527
416 12900
D5 9636
Other values (1956) 338570

Length

Max length: 13
Median length: 3
Mean length: 3.2178566
Min length: 1

Characters and Unicode

Total characters: 1328003
Distinct characters: 39
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 223
Unique (%): 0.1%

Sample

1st row: 521
2nd row: 950
3rd row: 226
4th row: PC120
5th row: S175

Common Values

ValueCountFrequency (%)
580 20179
 
4.9%
310 17886
 
4.3%
D6 13527
 
3.3%
416 12900
 
3.1%
D5 9636
 
2.3%
950 7605
 
1.8%
D3 6945
 
1.7%
D8 6903
 
1.7%
D4 6574
 
1.6%
12 6301
 
1.5%
Other values (1951) 304242
73.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
580 20179
 
4.9%
310 17886
 
4.3%
d6 13527
 
3.3%
416 12900
 
3.1%
d5 9636
 
2.3%
950 7605
 
1.8%
d3 6945
 
1.7%
d8 6903
 
1.7%
d4 6574
 
1.6%
12 6301
 
1.5%
Other values (1921) 304739
73.8%

Most occurring characters

ValueCountFrequency (%)
0 233138
17.6%
5 144826
10.9%
1 127402
9.6%
2 116403
8.8%
3 115146
8.7%
4 102074
7.7%
6 90301
 
6.8%
8 71818
 
5.4%
D 66704
 
5.0%
9 49269
 
3.7%
Other values (29) 210922
15.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1098319
82.7%
Uppercase Letter 227098
 
17.1%
Other Punctuation 1931
 
0.1%
Space Separator 497
 
< 0.1%
Dash Punctuation 158
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
D 66704
29.4%
C 21939
 
9.7%
E 17170
 
7.6%
P 16986
 
7.5%
X 16509
 
7.3%
S 14773
 
6.5%
T 13235
 
5.8%
L 10438
 
4.6%
W 9541
 
4.2%
A 8697
 
3.8%
Other values (15) 31106
13.7%
Decimal Number
ValueCountFrequency (%)
0 233138
21.2%
5 144826
13.2%
1 127402
11.6%
2 116403
10.6%
3 115146
10.5%
4 102074
9.3%
6 90301
 
8.2%
8 71818
 
6.5%
9 49269
 
4.5%
7 47942
 
4.4%
Other Punctuation
ValueCountFrequency (%)
. 1929
99.9%
/ 2
 
0.1%
Space Separator
ValueCountFrequency (%)
497
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 158
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1100905
82.9%
Latin 227098
 
17.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
D 66704
29.4%
C 21939
 
9.7%
E 17170
 
7.6%
P 16986
 
7.5%
X 16509
 
7.3%
S 14773
 
6.5%
T 13235
 
5.8%
L 10438
 
4.6%
W 9541
 
4.2%
A 8697
 
3.8%
Other values (15) 31106
13.7%
Common
ValueCountFrequency (%)
0 233138
21.2%
5 144826
13.2%
1 127402
11.6%
2 116403
10.6%
3 115146
10.5%
4 102074
9.3%
6 90301
 
8.2%
8 71818
 
6.5%
9 49269
 
4.5%
7 47942
 
4.4%
Other values (4) 2586
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1328003
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 233138
17.6%
5 144826
10.9%
1 127402
9.6%
2 116403
8.8%
3 115146
8.7%
4 102074
7.7%
6 90301
 
6.8%
8 71818
 
5.4%
D 66704
 
5.0%
9 49269
 
3.7%
Other values (29) 210922
15.9%

fiSecondaryDesc
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 177
Distinct (%): 0.1%
Missing: 140727
Missing (%): 34.1%
Memory size: 3.1 MiB
C 44431
B 40165
G 37915
H 24729
E 21532
Other values (172) 103199

Length

Max length: 13
Median length: 1
Mean length: 1.2444378
Min length: 1

Characters and Unicode

Total characters: 338451
Distinct characters: 37
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): < 0.1%

Sample

1st row: D
2nd row: F
3rd row: G
4th row: E
5th row: D

Common Values

ValueCountFrequency (%)
C 44431
 
10.8%
B 40165
 
9.7%
G 37915
 
9.2%
H 24729
 
6.0%
E 21532
 
5.2%
D 20023
 
4.9%
F 9420
 
2.3%
K 7979
 
1.9%
L 5628
 
1.4%
A 5596
 
1.4%
Other values (167) 54553
 
13.2%
(Missing) 140727
34.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
c 44441
15.9%
b 40221
14.4%
g 37924
13.6%
h 24730
8.9%
e 21815
 
7.8%
d 20023
 
7.2%
l 9535
 
3.4%
f 9420
 
3.4%
k 8906
 
3.2%
m 7116
 
2.6%
Other values (151) 54724
19.6%

Most occurring characters

ValueCountFrequency (%)
C 49068
14.5%
B 40862
12.1%
G 39880
11.8%
E 33708
10.0%
H 25122
 
7.4%
D 20276
 
6.0%
L 17445
 
5.2%
R 13192
 
3.9%
P 13011
 
3.8%
S 12658
 
3.7%
Other values (27) 73229
21.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 330809
97.7%
Space Separator 7237
 
2.1%
Decimal Number 325
 
0.1%
Dash Punctuation 44
 
< 0.1%
Other Punctuation 36
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 49068
14.8%
B 40862
12.4%
G 39880
12.1%
E 33708
10.2%
H 25122
 
7.6%
D 20276
 
6.1%
L 17445
 
5.3%
R 13192
 
4.0%
P 13011
 
3.9%
S 12658
 
3.8%
Other values (14) 65587
19.8%
Decimal Number
ValueCountFrequency (%)
3 133
40.9%
7 132
40.6%
2 25
 
7.7%
5 15
 
4.6%
0 10
 
3.1%
1 7
 
2.2%
9 2
 
0.6%
6 1
 
0.3%
Other Punctuation
ValueCountFrequency (%)
# 14
38.9%
? 14
38.9%
. 8
22.2%
Space Separator
ValueCountFrequency (%)
7237
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 44
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 330809
97.7%
Common 7642
 
2.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 49068
14.8%
B 40862
12.4%
G 39880
12.1%
E 33708
10.2%
H 25122
 
7.6%
D 20276
 
6.1%
L 17445
 
5.3%
R 13192
 
4.0%
P 13011
 
3.9%
S 12658
 
3.8%
Other values (14) 65587
19.8%
Common
ValueCountFrequency (%)
7237
94.7%
3 133
 
1.7%
7 132
 
1.7%
- 44
 
0.6%
2 25
 
0.3%
5 15
 
0.2%
# 14
 
0.2%
? 14
 
0.2%
0 10
 
0.1%
. 8
 
0.1%
Other values (3) 10
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 338451
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 49068
14.5%
B 40862
12.1%
G 39880
11.8%
E 33708
10.0%
H 25122
 
7.4%
D 20276
 
6.0%
L 17445
 
5.2%
R 13192
 
3.9%
P 13011
 
3.8%
S 12658
 
3.7%
Other values (27) 73229
21.6%

fiModelSeries
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 123
Distinct (%): 0.2%
Missing: 354031
Missing (%): 85.8%
Memory size: 3.1 MiB
II 13770
LC 9175
III 5351
-1 4646
-2 4033
Other values (118) 21692

Length

Max length: 11
Median length: 2
Mean length: 2.1856069
Min length: 1

Characters and Unicode

Total characters: 128223
Distinct characters: 48
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 22
Unique (%): < 0.1%

Sample

1st row: II
2nd row: -6E
3rd row: LC
4th row: II
5th row: II

Common Values

ValueCountFrequency (%)
II 13770
 
3.3%
LC 9175
 
2.2%
III 5351
 
1.3%
-1 4646
 
1.1%
-2 4033
 
1.0%
-6 3229
 
0.8%
-3 2611
 
0.6%
-5 2505
 
0.6%
-12 1386
 
0.3%
-7 921
 
0.2%
Other values (113) 11040
 
2.7%
(Missing) 354031
85.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ii 13770
23.5%
lc 9175
15.6%
iii 5352
 
9.1%
1 4852
 
8.3%
2 4420
 
7.5%
6 4005
 
6.8%
3 3179
 
5.4%
5 3076
 
5.2%
7 1640
 
2.8%
12 1408
 
2.4%
Other values (96) 7790
13.3%

Most occurring characters

ValueCountFrequency (%)
I 44328
34.6%
- 24224
18.9%
L 10622
 
8.3%
C 10325
 
8.1%
1 8696
 
6.8%
2 7315
 
5.7%
6 4394
 
3.4%
3 4358
 
3.4%
5 4062
 
3.2%
7 2072
 
1.6%
Other values (38) 7827
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 69804
54.4%
Decimal Number 33111
25.8%
Dash Punctuation 24224
 
18.9%
Other Punctuation 935
 
0.7%
Math Symbol 109
 
0.1%
Lowercase Letter 39
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 44328
63.5%
L 10622
 
15.2%
C 10325
 
14.8%
V 1192
 
1.7%
M 764
 
1.1%
E 670
 
1.0%
N 584
 
0.8%
A 487
 
0.7%
T 255
 
0.4%
S 105
 
0.2%
Other values (13) 472
 
0.7%
Decimal Number
ValueCountFrequency (%)
1 8696
26.3%
2 7315
22.1%
6 4394
13.3%
3 4358
13.2%
5 4062
12.3%
7 2072
 
6.3%
8 1329
 
4.0%
0 865
 
2.6%
4 19
 
0.1%
9 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
t 9
23.1%
e 7
17.9%
m 4
10.3%
a 4
10.3%
i 4
10.3%
o 4
10.3%
s 3
 
7.7%
r 3
 
7.7%
l 1
 
2.6%
Other Punctuation
ValueCountFrequency (%)
# 413
44.2%
? 413
44.2%
. 109
 
11.7%
Dash Punctuation
ValueCountFrequency (%)
- 24224
100.0%
Math Symbol
ValueCountFrequency (%)
+ 109
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
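
The category breakdown above follows the Unicode general categories (for example Uppercase Letter is `Lu`, Decimal Number is `Nd`, Dash Punctuation is `Pd`). A minimal sketch of how such counts could be derived with Python's standard `unicodedata` module; the helper name `category_counts` is hypothetical, and the input is limited to the five sample rows shown for fiModelSeries rather than the full column:

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lu', 'Nd', 'Pd')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Sample values taken from the fiModelSeries "Sample" rows.
sample = ["II", "-6E", "LC", "II", "II"]
print(category_counts(sample))
```

This is not necessarily how the report computes its figures; it only illustrates the character-property lookup the tables rely on.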

Most occurring scripts

ValueCountFrequency (%)
Latin 69843
54.5%
Common 58380
45.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 44328
63.5%
L 10622
 
15.2%
C 10325
 
14.8%
V 1192
 
1.7%
M 764
 
1.1%
E 670
 
1.0%
N 584
 
0.8%
A 487
 
0.7%
T 255
 
0.4%
S 105
 
0.2%
Other values (22) 511
 
0.7%
Common
ValueCountFrequency (%)
- 24224
41.5%
1 8696
 
14.9%
2 7315
 
12.5%
6 4394
 
7.5%
3 4358
 
7.5%
5 4062
 
7.0%
7 2072
 
3.5%
8 1329
 
2.3%
0 865
 
1.5%
# 413
 
0.7%
Other values (6) 652
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 128223
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 44328
34.6%
- 24224
18.9%
L 10622
 
8.3%
C 10325
 
8.1%
1 8696
 
6.8%
2 7315
 
5.7%
6 4394
 
3.4%
3 4358
 
3.4%
5 4062
 
3.2%
7 2072
 
1.6%
Other values (38) 7827
 
6.1%

fiModelDescriptor
Categorical

HIGH CARDINALITY  MISSING 

Distinct 140
Distinct (%) 0.2%
Missing 337882
Missing (%) 81.9%
Memory size 3.1 MiB

L 16464
LGP 16143
LC 13295
XL 6700
6 2944
Other values (135) 19270

Length

Max length 14
Median length 10
Mean length 1.8867889
Min length 1

Characters and Unicode

Total characters 141162
Distinct characters 50
Distinct categories 8
Distinct scripts 2
Distinct blocks 1

Unique

Unique 23
Unique (%) < 0.1%

Sample

1st row LC
2nd row 6
3rd row L
4th row LT
5th row CR

Common Values

Value               Count   Frequency (%)
L                   16464   4.0%
LGP                 16143   3.9%
LC                  13295   3.2%
XL                  6700    1.6%
6                   2944    0.7%
LT                  2468    0.6%
5                   2301    0.6%
3                   1929    0.5%
CR                  1798    0.4%
H                   1071    0.3%
Other values (130)  9703    2.4%
(Missing)           337882  81.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
l 16464
22.0%
lgp 16152
21.6%
lc 13295
17.8%
xl 6700
9.0%
6 2944
 
3.9%
lt 2468
 
3.3%
5 2301
 
3.1%
3 1929
 
2.6%
cr 1798
 
2.4%
h 1071
 
1.4%
Other values (127) 9701
13.0%

Most occurring characters

ValueCountFrequency (%)
L 57004
40.4%
P 16393
 
11.6%
G 16244
 
11.5%
C 16218
 
11.5%
X 7646
 
5.4%
T 4341
 
3.1%
R 3979
 
2.8%
6 2946
 
2.1%
S 2801
 
2.0%
5 2311
 
1.6%
Other values (40) 11279
 
8.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 131250
93.0%
Decimal Number 9686
 
6.9%
Other Punctuation 78
 
0.1%
Math Symbol 77
 
0.1%
Space Separator 35
 
< 0.1%
Lowercase Letter 28
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Open Punctuation 4
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 57004
43.4%
P 16393
 
12.5%
G 16244
 
12.4%
C 16218
 
12.4%
X 7646
 
5.8%
T 4341
 
3.3%
R 3979
 
3.0%
S 2801
 
2.1%
H 1353
 
1.0%
Z 1065
 
0.8%
Other values (14) 4206
 
3.2%
Lowercase Letter
ValueCountFrequency (%)
g 4
14.3%
i 4
14.3%
h 4
14.3%
n 2
7.1%
e 2
7.1%
a 2
7.1%
c 2
7.1%
o 2
7.1%
f 2
7.1%
t 2
7.1%
Decimal Number
ValueCountFrequency (%)
6 2946
30.4%
5 2311
23.9%
3 2060
21.3%
7 973
 
10.0%
2 590
 
6.1%
0 309
 
3.2%
8 308
 
3.2%
4 120
 
1.2%
1 69
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 77
98.7%
/ 1
 
1.3%
Math Symbol
ValueCountFrequency (%)
+ 77
100.0%
Space Separator
ValueCountFrequency (%)
35
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 131278
93.0%
Common 9884
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 57004
43.4%
P 16393
 
12.5%
G 16244
 
12.4%
C 16218
 
12.4%
X 7646
 
5.8%
T 4341
 
3.3%
R 3979
 
3.0%
S 2801
 
2.1%
H 1353
 
1.0%
Z 1065
 
0.8%
Other values (25) 4234
 
3.2%
Common
ValueCountFrequency (%)
6 2946
29.8%
5 2311
23.4%
3 2060
20.8%
7 973
 
9.8%
2 590
 
6.0%
0 309
 
3.1%
8 308
 
3.1%
4 120
 
1.2%
. 77
 
0.8%
+ 77
 
0.8%
Other values (5) 113
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 141162
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L 57004
40.4%
P 16393
 
11.6%
G 16244
 
11.5%
C 16218
 
11.5%
X 7646
 
5.4%
T 4341
 
3.1%
R 3979
 
2.8%
6 2946
 
2.1%
S 2801
 
2.0%
5 2311
 
1.6%
Other values (40) 11279
 
8.0%

ProductSize
Categorical

Distinct 6
Distinct (%) < 0.1%
Missing 216605
Missing (%) 52.5%
Memory size 3.1 MiB

Medium 64342
Large / Medium 51297
Small 27057
Mini 25721
Large 21396

Length

Max length 14
Median length 7
Mean length 7.6153611
Min length 4

Characters and Unicode

Total characters 1493319
Distinct characters 20
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Medium
2nd row Small
3rd row Large / Medium
4th row Mini
5th row Large

Common Values

Value           Count   Frequency (%)
Medium          64342   15.6%
Large / Medium  51297   12.4%
Small           27057   6.6%
Mini            25721   6.2%
Large           21396   5.2%
Compact         6280    1.5%
(Missing)       216605  52.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
medium 115639
38.7%
large 72693
24.3%
51297
17.2%
small 27057
 
9.1%
mini 25721
 
8.6%
compact 6280
 
2.1%
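
The word-level counts above are consistent with lowercasing each value, splitting it on whitespace, and stripping surrounding punctuation, so that the "/" in "Large / Medium" becomes an empty token (which would explain the unlabeled row with count 51297). That tokenization is an assumption inferred from the numbers, not something the report states. A minimal Python sketch, with the hypothetical helper `word_counts` and counts copied from the ProductSize "Common Values" table:

```python
import string
from collections import Counter

# Counts copied from the ProductSize "Common Values" table (missing rows excluded).
value_counts = {
    "Medium": 64342,
    "Large / Medium": 51297,
    "Small": 27057,
    "Mini": 25721,
    "Large": 21396,
    "Compact": 6280,
}


def word_counts(counts):
    """Aggregate per-value counts into per-token counts."""
    words = Counter()
    for value, n in counts.items():
        for token in value.lower().split():
            # Strip leading/trailing punctuation; a bare "/" becomes "".
            words[token.strip(string.punctuation)] += n
    return words


for token, n in word_counts(value_counts).most_common():
    print(repr(token), n)
```

Under this assumption "medium" collects the rows of both "Medium" and "Large / Medium", which matches the 115639 reported above.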

Most occurring characters

ValueCountFrequency (%)
e 188332
12.6%
i 167081
11.2%
m 148976
10.0%
M 141360
9.5%
d 115639
7.7%
u 115639
7.7%
a 106030
7.1%
102594
 
6.9%
r 72693
 
4.9%
g 72693
 
4.9%
Other values (10) 262282
17.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1092038
73.1%
Uppercase Letter 247390
 
16.6%
Space Separator 102594
 
6.9%
Other Punctuation 51297
 
3.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 188332
17.2%
i 167081
15.3%
m 148976
13.6%
d 115639
10.6%
u 115639
10.6%
a 106030
9.7%
r 72693
 
6.7%
g 72693
 
6.7%
l 54114
 
5.0%
n 25721
 
2.4%
Other values (4) 25120
 
2.3%
Uppercase Letter
ValueCountFrequency (%)
M 141360
57.1%
L 72693
29.4%
S 27057
 
10.9%
C 6280
 
2.5%
Space Separator
ValueCountFrequency (%)
102594
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 51297
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1339428
89.7%
Common 153891
 
10.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 188332
14.1%
i 167081
12.5%
m 148976
11.1%
M 141360
10.6%
d 115639
8.6%
u 115639
8.6%
a 106030
7.9%
r 72693
 
5.4%
g 72693
 
5.4%
L 72693
 
5.4%
Other values (8) 138292
10.3%
Common
ValueCountFrequency (%)
102594
66.7%
/ 51297
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1493319
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 188332
12.6%
i 167081
11.2%
m 148976
10.0%
M 141360
9.5%
d 115639
7.7%
u 115639
7.7%
a 106030
7.1%
102594
 
6.9%
r 72693
 
4.9%
g 72693
 
4.9%
Other values (10) 262282
17.6%

fiProductClassDesc
Categorical

HIGH CARDINALITY 

Distinct 74
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 3.1 MiB

Backhoe Loader - 14.0 to 15.0 Ft Standard Digging Depth 57542
Track Type Tractor, Dozer - 20.0 to 75.0 Horsepower 18131
Wheel Loader - 150.0 to 175.0 Horsepower 15537
Track Type Tractor, Dozer - 85.0 to 105.0 Horsepower 15161
Hydraulic Excavator, Track - 21.0 to 24.0 Metric Tons 13736
Other values (69) 292591

Length

Max length 64
Median length 57
Mean length 49.730275
Min length 26

Characters and Unicode

Total characters 20523585
Distinct characters 54
Distinct categories 9
Distinct scripts 2
Distinct blocks 1

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row Wheel Loader - 110.0 to 120.0 Horsepower
2nd row Wheel Loader - 150.0 to 175.0 Horsepower
3rd row Skid Steer Loader - 1351.0 to 1601.0 Lb Operating Capacity
4th row Hydraulic Excavator, Track - 12.0 to 14.0 Metric Tons
5th row Skid Steer Loader - 1601.0 to 1751.0 Lb Operating Capacity

Common Values

Value                                                    Count   Frequency (%)
Backhoe Loader - 14.0 to 15.0 Ft Standard Digging Depth  57542   13.9%
Track Type Tractor, Dozer - 20.0 to 75.0 Horsepower      18131   4.4%
Wheel Loader - 150.0 to 175.0 Horsepower                 15537   3.8%
Track Type Tractor, Dozer - 85.0 to 105.0 Horsepower     15161   3.7%
Hydraulic Excavator, Track - 21.0 to 24.0 Metric Tons    13736   3.3%
Track Type Tractor, Dozer - 130.0 to 160.0 Horsepower    11530   2.8%
Hydraulic Excavator, Track - 12.0 to 14.0 Metric Tons    11527   2.8%
Track Type Tractor, Dozer - 260.0 + Horsepower           11227   2.7%
Wheel Loader - 120.0 to 135.0 Horsepower                 10912   2.6%
Backhoe Loader - 15.0 to 16.0 Ft Standard Digging Depth  10847   2.6%
Other values (64)                                        236548  57.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
430478
 
12.1%
to 387133
 
10.9%
loader 199628
 
5.6%
track 186812
 
5.2%
horsepower 181494
 
5.1%
excavator 104230
 
2.9%
hydraulic 104230
 
2.9%
tons 104075
 
2.9%
metric 104075
 
2.9%
type 82582
 
2.3%
Other values (73) 1680198
47.1%

Most occurring characters

ValueCountFrequency (%)
3152237
15.4%
r 1555644
 
7.6%
o 1457249
 
7.1%
e 1308114
 
6.4%
a 1177851
 
5.7%
0 1128956
 
5.5%
t 1076538
 
5.2%
. 794814
 
3.9%
c 707624
 
3.4%
1 584496
 
2.8%
Other values (44) 7580062
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11251286
54.8%
Space Separator 3152237
 
15.4%
Decimal Number 2752604
 
13.4%
Uppercase Letter 1952510
 
9.5%
Other Punctuation 981626
 
4.8%
Dash Punctuation 412698
 
2.0%
Math Symbol 20548
 
0.1%
Open Punctuation 38
 
< 0.1%
Close Punctuation 38
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1555644
13.8%
o 1457249
13.0%
e 1308114
11.6%
a 1177851
10.5%
t 1076538
9.6%
c 707624
 
6.3%
d 540949
 
4.8%
i 512629
 
4.6%
p 430444
 
3.8%
n 314191
 
2.8%
Other values (14) 2170053
19.3%
Uppercase Letter
ValueCountFrequency (%)
T 456051
23.4%
H 285724
14.6%
L 243846
12.5%
D 238370
12.2%
S 167916
 
8.6%
M 130333
 
6.7%
E 104230
 
5.3%
B 81401
 
4.2%
F 77894
 
4.0%
W 73216
 
3.7%
Other values (3) 93529
 
4.8%
Decimal Number
ValueCountFrequency (%)
0 1128956
41.0%
1 584496
21.2%
5 300242
 
10.9%
2 197801
 
7.2%
4 145995
 
5.3%
3 113186
 
4.1%
6 103535
 
3.8%
7 95226
 
3.5%
8 50274
 
1.8%
9 32893
 
1.2%
Other Punctuation
ValueCountFrequency (%)
. 794814
81.0%
, 186812
 
19.0%
Space Separator
ValueCountFrequency (%)
3152237
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 412698
100.0%
Math Symbol
ValueCountFrequency (%)
+ 20548
100.0%
Open Punctuation
ValueCountFrequency (%)
( 38
100.0%
Close Punctuation
ValueCountFrequency (%)
) 38
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13203796
64.3%
Common 7319789
35.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1555644
 
11.8%
o 1457249
 
11.0%
e 1308114
 
9.9%
a 1177851
 
8.9%
t 1076538
 
8.2%
c 707624
 
5.4%
d 540949
 
4.1%
i 512629
 
3.9%
T 456051
 
3.5%
p 430444
 
3.3%
Other values (27) 3980703
30.1%
Common
ValueCountFrequency (%)
3152237
43.1%
0 1128956
 
15.4%
. 794814
 
10.9%
1 584496
 
8.0%
- 412698
 
5.6%
5 300242
 
4.1%
2 197801
 
2.7%
, 186812
 
2.6%
4 145995
 
2.0%
3 113186
 
1.5%
Other values (7) 302552
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20523585
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3152237
15.4%
r 1555644
 
7.6%
o 1457249
 
7.1%
e 1308114
 
6.4%
a 1177851
 
5.7%
0 1128956
 
5.5%
t 1076538
 
5.2%
. 794814
 
3.9%
c 707624
 
3.4%
1 584496
 
2.8%
Other values (44) 7580062
36.9%

state
Categorical

Distinct 53
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 3.1 MiB

Florida 67320
Texas 53110
California 29761
Washington 16222
Georgia 14633
Other values (48) 231652

Length

Max length 14
Median length 12
Mean length 8.0790869
Min length 4

Characters and Unicode

Total characters 3334223
Distinct characters 46
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Alabama
2nd row North Carolina
3rd row New York
4th row Texas
5th row New York

Common Values

Value              Count   Frequency (%)
Florida            67320   16.3%
Texas              53110   12.9%
California         29761   7.2%
Washington         16222   3.9%
Georgia            14633   3.5%
Maryland           13322   3.2%
Mississippi        13240   3.2%
Ohio               12369   3.0%
Illinois           11540   2.8%
Colorado           11529   2.8%
Other values (43)  169652  41.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
florida 67320
 
14.6%
texas 53110
 
11.5%
california 29761
 
6.5%
new 26164
 
5.7%
carolina 20587
 
4.5%
washington 16224
 
3.5%
georgia 14633
 
3.2%
maryland 13322
 
2.9%
mississippi 13240
 
2.9%
ohio 12369
 
2.7%
Other values (46) 194410
42.2%

Most occurring characters

ValueCountFrequency (%)
a 433977
13.0%
i 371936
 
11.2%
o 293474
 
8.8%
n 250517
 
7.5%
s 228524
 
6.9%
e 218565
 
6.6%
r 218098
 
6.5%
l 188474
 
5.7%
d 108219
 
3.2%
t 74305
 
2.2%
Other values (36) 948134
28.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2824639
84.7%
Uppercase Letter 461142
 
13.8%
Space Separator 48442
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 433977
15.4%
i 371936
13.2%
o 293474
10.4%
n 250517
8.9%
s 228524
8.1%
e 218565
7.7%
r 218098
7.7%
l 188474
6.7%
d 108219
 
3.8%
t 74305
 
2.6%
Other values (14) 438550
15.5%
Uppercase Letter
ValueCountFrequency (%)
C 70155
15.2%
F 67320
14.6%
T 63408
13.8%
M 53866
11.7%
N 45078
9.8%
A 24019
 
5.2%
W 21481
 
4.7%
I 19108
 
4.1%
O 15606
 
3.4%
G 14633
 
3.2%
Other values (11) 66468
14.4%
Space Separator
ValueCountFrequency (%)
48442
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3285781
98.5%
Common 48442
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 433977
13.2%
i 371936
11.3%
o 293474
 
8.9%
n 250517
 
7.6%
s 228524
 
7.0%
e 218565
 
6.7%
r 218098
 
6.6%
l 188474
 
5.7%
d 108219
 
3.3%
t 74305
 
2.3%
Other values (35) 899692
27.4%
Common
ValueCountFrequency (%)
48442
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3334223
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 433977
13.0%
i 371936
 
11.2%
o 293474
 
8.8%
n 250517
 
7.5%
s 228524
 
6.9%
e 218565
 
6.6%
r 218098
 
6.5%
l 188474
 
5.7%
d 108219
 
3.2%
t 74305
 
2.2%
Other values (36) 948134
28.4%

ProductGroup
Categorical

Distinct 6
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 3.1 MiB

TEX 104230
TTT 82582
BL 81401
WL 73216
SSL 45011

Length

Max length 3
Median length 3
Mean length 2.5617255
Min length 2

Characters and Unicode

Total characters 1057219
Distinct characters 9
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row WL
2nd row WL
3rd row SSL
4th row TEX
5th row SSL

Common Values

Value  Count   Frequency (%)
TEX    104230  25.3%
TTT    82582   20.0%
BL     81401   19.7%
WL     73216   17.7%
SSL    45011   10.9%
MG     26258   6.4%
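
For ProductGroup, the length and character totals reported in this section can be cross-checked from the value counts alone: total characters is the count-weighted sum of value lengths, and the mean length is that total divided by the number of rows. The Python sketch below is illustrative only; the helper `length_stats` is hypothetical, and its median (the lower middle row of the sorted lengths) is one possible definition that happens to agree with this variable, not necessarily the report's own method:

```python
# Counts copied from the ProductGroup "Common Values" table.
value_counts = {
    "TEX": 104230,
    "TTT": 82582,
    "BL": 81401,
    "WL": 73216,
    "SSL": 45011,
    "MG": 26258,
}


def length_stats(counts):
    """Return (total characters, mean length, lower-median length) over all rows."""
    n_rows = sum(counts.values())
    total_chars = sum(len(value) * n for value, n in counts.items())

    # Number of rows per string length.
    by_length = {}
    for value, n in counts.items():
        by_length[len(value)] = by_length.get(len(value), 0) + n

    # Walk the lengths in ascending order until the lower middle row is covered.
    middle_rank = (n_rows + 1) // 2
    seen = 0
    median = None
    for length in sorted(by_length):
        seen += by_length[length]
        if seen >= middle_rank:
            median = length
            break
    return total_chars, total_chars / n_rows, median


total, mean, median = length_stats(value_counts)
print(total, round(mean, 7), median)
```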

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
tex 104230
25.3%
ttt 82582
20.0%
bl 81401
19.7%
wl 73216
17.7%
ssl 45011
10.9%
mg 26258
 
6.4%

Most occurring characters

ValueCountFrequency (%)
T 351976
33.3%
L 199628
18.9%
E 104230
 
9.9%
X 104230
 
9.9%
S 90022
 
8.5%
B 81401
 
7.7%
W 73216
 
6.9%
M 26258
 
2.5%
G 26258
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1057219
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 351976
33.3%
L 199628
18.9%
E 104230
 
9.9%
X 104230
 
9.9%
S 90022
 
8.5%
B 81401
 
7.7%
W 73216
 
6.9%
M 26258
 
2.5%
G 26258
 
2.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 1057219
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 351976
33.3%
L 199628
18.9%
E 104230
 
9.9%
X 104230
 
9.9%
S 90022
 
8.5%
B 81401
 
7.7%
W 73216
 
6.9%
M 26258
 
2.5%
G 26258
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1057219
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 351976
33.3%
L 199628
18.9%
E 104230
 
9.9%
X 104230
 
9.9%
S 90022
 
8.5%
B 81401
 
7.7%
W 73216
 
6.9%
M 26258
 
2.5%
G 26258
 
2.5%

ProductGroupDesc
Categorical

Distinct 6
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 3.1 MiB

Track Excavators 104230
Track Type Tractors 82582
Backhoe Loaders 81401
Wheel Loader 73216
Skid Steer Loaders 45011

Length

Max length 19
Median length 16
Mean length 15.720689
Min length 12

Characters and Unicode

Total characters 6487897
Distinct characters 25
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Wheel Loader
2nd row Wheel Loader
3rd row Skid Steer Loaders
4th row Track Excavators
5th row Skid Steer Loaders

Common Values

Value                Count   Frequency (%)
Track Excavators     104230  25.3%
Track Type Tractors  82582   20.0%
Backhoe Loaders      81401   19.7%
Wheel Loader         73216   17.7%
Skid Steer Loaders   45011   10.9%
Motor Graders        26258   6.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
track 186812
19.6%
loaders 126412
13.3%
excavators 104230
10.9%
type 82582
8.7%
tractors 82582
8.7%
backhoe 81401
8.5%
wheel 73216
 
7.7%
loader 73216
 
7.7%
skid 45011
 
4.7%
steer 45011
 
4.7%
Other values (2) 52516
 
5.5%

Most occurring characters

ValueCountFrequency (%)
a 785141
12.1%
r 779619
12.0%
e 626323
9.7%
540291
 
8.3%
o 520357
 
8.0%
c 455025
 
7.0%
T 351976
 
5.4%
s 339482
 
5.2%
k 313224
 
4.8%
d 270897
 
4.2%
Other values (15) 1505562
23.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4994617
77.0%
Uppercase Letter 952989
 
14.7%
Space Separator 540291
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 785141
15.7%
r 779619
15.6%
e 626323
12.5%
o 520357
10.4%
c 455025
9.1%
s 339482
6.8%
k 313224
 
6.3%
d 270897
 
5.4%
t 258081
 
5.2%
h 154617
 
3.1%
Other values (6) 491851
9.8%
Uppercase Letter
ValueCountFrequency (%)
T 351976
36.9%
L 199628
20.9%
E 104230
 
10.9%
S 90022
 
9.4%
B 81401
 
8.5%
W 73216
 
7.7%
M 26258
 
2.8%
G 26258
 
2.8%
Space Separator
ValueCountFrequency (%)
540291
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5947606
91.7%
Common 540291
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 785141
13.2%
r 779619
13.1%
e 626323
10.5%
o 520357
8.7%
c 455025
 
7.7%
T 351976
 
5.9%
s 339482
 
5.7%
k 313224
 
5.3%
d 270897
 
4.6%
t 258081
 
4.3%
Other values (14) 1247481
21.0%
Common
ValueCountFrequency (%)
540291
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6487897
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 785141
12.1%
r 779619
12.0%
e 626323
9.7%
540291
 
8.3%
o 520357
 
8.0%
c 455025
 
7.0%
T 351976
 
5.4%
s 339482
 
5.2%
k 313224
 
4.8%
d 270897
 
4.2%
Other values (15) 1505562
23.2%

Drive_System
Categorical

Distinct 4
Distinct (%) < 0.1%
Missing 305611
Missing (%) 74.1%
Memory size 3.1 MiB

Two Wheel Drive 47546
Four Wheel Drive 33551
No 25166
All Wheel Drive 824

Length

Max length 16
Median length 15
Mean length 12.258239
Min length 2

Characters and Unicode

Total characters 1312698
Distinct characters 16
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Four Wheel Drive
2nd row Four Wheel Drive
3rd row Four Wheel Drive
4th row Four Wheel Drive
5th row Four Wheel Drive

Common Values

Value             Count   Frequency (%)
Two Wheel Drive   47546   11.5%
Four Wheel Drive  33551   8.1%
No                25166   6.1%
All Wheel Drive   824     0.2%
(Missing)         305611  74.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
wheel 81921
30.2%
drive 81921
30.2%
two 47546
17.5%
four 33551
12.4%
no 25166
 
9.3%
all 824
 
0.3%

Most occurring characters

ValueCountFrequency (%)
e 245763
18.7%
163842
12.5%
r 115472
8.8%
o 106263
8.1%
l 83569
 
6.4%
W 81921
 
6.2%
h 81921
 
6.2%
D 81921
 
6.2%
i 81921
 
6.2%
v 81921
 
6.2%
Other values (6) 188184
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 877927
66.9%
Uppercase Letter 270929
 
20.6%
Space Separator 163842
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 245763
28.0%
r 115472
13.2%
o 106263
12.1%
l 83569
 
9.5%
h 81921
 
9.3%
i 81921
 
9.3%
v 81921
 
9.3%
w 47546
 
5.4%
u 33551
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
W 81921
30.2%
D 81921
30.2%
T 47546
17.5%
F 33551
12.4%
N 25166
 
9.3%
A 824
 
0.3%
Space Separator
ValueCountFrequency (%)
163842
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1148856
87.5%
Common 163842
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 245763
21.4%
r 115472
10.1%
o 106263
9.2%
l 83569
 
7.3%
W 81921
 
7.1%
h 81921
 
7.1%
D 81921
 
7.1%
i 81921
 
7.1%
v 81921
 
7.1%
T 47546
 
4.1%
Other values (5) 140638
12.2%
Common
ValueCountFrequency (%)
163842
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1312698
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 245763
18.7%
163842
12.5%
r 115472
8.8%
o 106263
8.1%
l 83569
 
6.4%
W 81921
 
6.2%
h 81921
 
6.2%
D 81921
 
6.2%
i 81921
 
6.2%
v 81921
 
6.2%
Other values (6) 188184
14.3%

Enclosure
Categorical

Distinct 6
Distinct (%) < 0.1%
Missing 334
Missing (%) 0.1%
Memory size 3.1 MiB

OROPS 177971
EROPS 141769
EROPS w AC 92601
EROPS AC 18
NO ROPS 3

Length

Max length 19
Median length 5
Mean length 6.12302
Min length 5

Characters and Unicode

Total characters 2524913
Distinct characters 21
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row EROPS w AC
2nd row EROPS w AC
3rd row OROPS
4th row EROPS w AC
5th row EROPS

Common Values

Value                Count   Frequency (%)
OROPS                177971  43.1%
EROPS                141769  34.4%
EROPS w AC           92601   22.4%
EROPS AC             18      < 0.1%
NO ROPS              3       < 0.1%
None or Unspecified  2       < 0.1%
(Missing)            334     0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
erops 234388
39.2%
orops 177971
29.8%
ac 92619
 
15.5%
w 92601
 
15.5%
no 3
 
< 0.1%
rops 3
 
< 0.1%
none 2
 
< 0.1%
or 2
 
< 0.1%
unspecified 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
O 590336
23.4%
P 412362
16.3%
S 412362
16.3%
R 412362
16.3%
E 234388
 
9.3%
185227
 
7.3%
A 92619
 
3.7%
C 92619
 
3.7%
w 92601
 
3.7%
e 6
 
< 0.1%
Other values (11) 31
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2247055
89.0%
Space Separator 185227
 
7.3%
Lowercase Letter 92631
 
3.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 92601
> 99.9%
e 6
 
< 0.1%
i 4
 
< 0.1%
o 4
 
< 0.1%
n 4
 
< 0.1%
r 2
 
< 0.1%
s 2
 
< 0.1%
p 2
 
< 0.1%
c 2
 
< 0.1%
f 2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
O 590336
26.3%
P 412362
18.4%
S 412362
18.4%
R 412362
18.4%
E 234388
 
10.4%
A 92619
 
4.1%
C 92619
 
4.1%
N 5
 
< 0.1%
U 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
185227
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2339686
92.7%
Common 185227
 
7.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 590336
25.2%
P 412362
17.6%
S 412362
17.6%
R 412362
17.6%
E 234388
 
10.0%
A 92619
 
4.0%
C 92619
 
4.0%
w 92601
 
4.0%
e 6
 
< 0.1%
N 5
 
< 0.1%
Other values (10) 26
 
< 0.1%
Common
ValueCountFrequency (%)
185227
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2524913
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 590336
23.4%
P 412362
16.3%
S 412362
16.3%
R 412362
16.3%
E 234388
 
9.3%
185227
 
7.3%
A 92619
 
3.7%
C 92619
 
3.7%
w 92601
 
3.7%
e 6
 
< 0.1%
Other values (11) 31
 
< 0.1%

Forks
Categorical

IMBALANCE  MISSING 

Distinct 2
Distinct (%) < 0.1%
Missing 214983
Missing (%) 52.1%
Memory size 3.1 MiB

None or Unspecified 183061
Yes 14654

Length

Max length 19
Median length 19
Mean length 17.814131
Min length 3

Characters and Unicode

Total characters 3522121
Distinct characters 14
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row None or Unspecified
2nd row None or Unspecified
3rd row None or Unspecified
4th row None or Unspecified
5th row None or Unspecified

Common Values

Value                Count   Frequency (%)
None or Unspecified  183061  44.4%
Yes                  14654   3.6%
(Missing)            214983  52.1%
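
The IMBALANCE flag on this variable corresponds to the 61.9% figure listed for Forks in the report's alerts. One definition that reproduces that number from the non-missing counts is one minus the Shannon entropy normalized by the logarithm of the number of classes. Treat this as an assumption about how the score is derived; the function name `imbalance_score` is hypothetical:

```python
import math


def imbalance_score(counts):
    """1 - H / log(k): 0 for a uniform distribution, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy of the class proportions (natural log; the base cancels out).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


# Non-missing counts from the Forks "Common Values" table.
forks = [183061, 14654]
print(f"{imbalance_score(forks):.1%}")
```

Missing rows are left out of the counts here; including them as a third class would give a different score.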

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 183061
32.5%
or 183061
32.5%
unspecified 183061
32.5%
yes 14654
 
2.6%

Most occurring characters

ValueCountFrequency (%)
e 563837
16.0%
o 366122
10.4%
n 366122
10.4%
366122
10.4%
i 366122
10.4%
s 197715
 
5.6%
N 183061
 
5.2%
r 183061
 
5.2%
U 183061
 
5.2%
p 183061
 
5.2%
Other values (4) 563837
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2775223
78.8%
Uppercase Letter 380776
 
10.8%
Space Separator 366122
 
10.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 563837
20.3%
o 366122
13.2%
n 366122
13.2%
i 366122
13.2%
s 197715
 
7.1%
r 183061
 
6.6%
p 183061
 
6.6%
c 183061
 
6.6%
f 183061
 
6.6%
d 183061
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
N 183061
48.1%
U 183061
48.1%
Y 14654
 
3.8%
Space Separator
ValueCountFrequency (%)
366122
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3155999
89.6%
Common 366122
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 563837
17.9%
o 366122
11.6%
n 366122
11.6%
i 366122
11.6%
s 197715
 
6.3%
N 183061
 
5.8%
r 183061
 
5.8%
U 183061
 
5.8%
p 183061
 
5.8%
c 183061
 
5.8%
Other values (3) 380776
12.1%
Common
ValueCountFrequency (%)
366122
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3522121
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 563837
16.0%
o 366122
10.4%
n 366122
10.4%
366122
10.4%
i 366122
10.4%
s 197715
 
5.6%
N 183061
 
5.2%
r 183061
 
5.2%
U 183061
 
5.2%
p 183061
 
5.2%
Other values (4) 563837
16.0%

Pad_Type
Categorical

IMBALANCE  MISSING 

Distinct 4
Distinct (%) < 0.1%
Missing 331602
Missing (%) 80.3%
Memory size 3.1 MiB

None or Unspecified 72395
Reversible 5950
Street 2725
Grouser 26

Length

Max length 19
Median length 19
Mean length 17.898996
Min length 6

Characters and Unicode

Total characters 1451537
Distinct characters 21
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row None or Unspecified
2nd row Reversible
3rd row Street
4th row None or Unspecified
5th row None or Unspecified

Common Values

Value                Count   Frequency (%)
None or Unspecified  72395   17.5%
Reversible           5950    1.4%
Street               2725    0.7%
Grouser              26      < 0.1%
(Missing)            331602  80.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 72395
32.0%
or 72395
32.0%
unspecified 72395
32.0%
reversible 5950
 
2.6%
street 2725
 
1.2%
grouser 26
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 240511
16.6%
i 150740
10.4%
o 144816
10.0%
n 144790
10.0%
144790
10.0%
r 81122
 
5.6%
s 78371
 
5.4%
d 72395
 
5.0%
f 72395
 
5.0%
N 72395
 
5.0%
Other values (11) 249212
17.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1153256
79.5%
Uppercase Letter 153491
 
10.6%
Space Separator 144790
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 240511
20.9%
i 150740
13.1%
o 144816
12.6%
n 144790
12.6%
r 81122
 
7.0%
s 78371
 
6.8%
d 72395
 
6.3%
f 72395
 
6.3%
c 72395
 
6.3%
p 72395
 
6.3%
Other values (5) 23326
 
2.0%
Uppercase Letter
ValueCountFrequency (%)
N 72395
47.2%
U 72395
47.2%
R 5950
 
3.9%
S 2725
 
1.8%
G 26
 
< 0.1%
Space Separator
ValueCountFrequency (%)
144790
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1306747
90.0%
Common 144790
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 240511
18.4%
i 150740
11.5%
o 144816
11.1%
n 144790
11.1%
r 81122
 
6.2%
s 78371
 
6.0%
d 72395
 
5.5%
f 72395
 
5.5%
N 72395
 
5.5%
c 72395
 
5.5%
Other values (10) 176817
13.5%
Common
ValueCountFrequency (%)
144790
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1451537
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 240511
16.6%
i 150740
10.4%
o 144816
10.0%
n 144790
10.0%
144790
10.0%
r 81122
 
5.6%
s 78371
 
5.4%
d 72395
 
5.0%
f 72395
 
5.0%
N 72395
 
5.0%
Other values (11) 249212
17.2%

Ride_Control
Categorical

Distinct 3
Distinct (%) < 0.1%
Missing 259970
Missing (%) 63.0%
Memory size 3.1 MiB

No 79389
None or Unspecified 64693
Yes 8646

Length

Max length 19
Median length 2
Mean length 9.2575232
Min length 2

Characters and Unicode

Total characters 1413883
Distinct characters 14
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row None or Unspecified
2nd row None or Unspecified
3rd row No
4th row No
5th row None or Unspecified

Common Values

Value                Count   Frequency (%)
No                   79389   19.2%
None or Unspecified  64693   15.7%
Yes                  8646    2.1%
(Missing)            259970  63.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 79389
28.1%
none 64693
22.9%
or 64693
22.9%
unspecified 64693
22.9%
yes 8646
 
3.1%

Most occurring characters

ValueCountFrequency (%)
o 208775
14.8%
e 202725
14.3%
N 144082
10.2%
n 129386
9.2%
129386
9.2%
i 129386
9.2%
s 73339
 
5.2%
r 64693
 
4.6%
U 64693
 
4.6%
p 64693
 
4.6%
Other values (4) 202725
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1067076
75.5%
Uppercase Letter 217421
 
15.4%
Space Separator 129386
 
9.2%
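
The category totals above follow directly from the Common Values counts for Ride_Control. As a minimal sketch, assuming Python's standard `unicodedata` module classifies characters the same way the report does (the report does not say how it derives them), the weighted totals can be reproduced as follows. The names `value_counts` and `category_totals` are illustrative only.

```python
import unicodedata
from collections import Counter

# Level frequencies copied from the Ride_Control "Common Values" table.
value_counts = {"No": 79389, "None or Unspecified": 64693, "Yes": 8646}

# Sum Unicode general categories over every character of every level,
# weighting each character by how often its level occurs.
category_totals = Counter()
for value, n in value_counts.items():
    for ch in value:
        category_totals[unicodedata.category(ch)] += n

for code, total in category_totals.most_common():
    print(code, total)
```

Here `Ll`, `Lu`, and `Zs` are the general-category codes for Lowercase Letter, Uppercase Letter, and Space Separator, and the three totals sum to the 1413883 total characters reported for this variable.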

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 208775
19.6%
e 202725
19.0%
n 129386
12.1%
i 129386
12.1%
s 73339
 
6.9%
r 64693
 
6.1%
p 64693
 
6.1%
c 64693
 
6.1%
f 64693
 
6.1%
d 64693
 
6.1%
Uppercase Letter
ValueCountFrequency (%)
N 144082
66.3%
U 64693
29.8%
Y 8646
 
4.0%
Space Separator
ValueCountFrequency (%)
129386
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1284497
90.8%
Common 129386
 
9.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 208775
16.3%
e 202725
15.8%
N 144082
11.2%
n 129386
10.1%
i 129386
10.1%
s 73339
 
5.7%
r 64693
 
5.0%
U 64693
 
5.0%
p 64693
 
5.0%
c 64693
 
5.0%
Other values (3) 138032
10.7%
Common
ValueCountFrequency (%)
129386
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1413883
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 208775
14.8%
e 202725
14.3%
N 144082
10.2%
n 129386
9.2%
129386
9.2%
i 129386
9.2%
s 73339
 
5.2%
r 64693
 
4.6%
U 64693
 
4.6%
p 64693
 
4.6%
Other values (4) 202725
14.3%

Stick
Categorical

Distinct 2
Distinct (%) < 0.1%
Missing 331602
Missing (%) 80.3%
Memory size 3.1 MiB
Standard 49854
Extended 31242

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 648768
Distinct characters 9
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Extended
2nd row Standard
3rd row Standard
4th row Standard
5th row Standard

Common Values

ValueCountFrequency (%)
Standard 49854
 
12.1%
Extended 31242
 
7.6%
(Missing) 331602
80.3%

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
standard 49854
61.5%
extended 31242
38.5%

Most occurring characters

ValueCountFrequency (%)
d 162192
25.0%
a 99708
15.4%
t 81096
12.5%
n 81096
12.5%
e 62484
 
9.6%
S 49854
 
7.7%
r 49854
 
7.7%
E 31242
 
4.8%
x 31242
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 567672
87.5%
Uppercase Letter 81096
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 162192
28.6%
a 99708
17.6%
t 81096
14.3%
n 81096
14.3%
e 62484
 
11.0%
r 49854
 
8.8%
x 31242
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
S 49854
61.5%
E 31242
38.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 648768
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
d 162192
25.0%
a 99708
15.4%
t 81096
12.5%
n 81096
12.5%
e 62484
 
9.6%
S 49854
 
7.7%
r 49854
 
7.7%
E 31242
 
4.8%
x 31242
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 648768
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d 162192
25.0%
a 99708
15.4%
t 81096
12.5%
n 81096
12.5%
e 62484
 
9.6%
S 49854
 
7.7%
r 49854
 
7.7%
E 31242
 
4.8%
x 31242
 
4.8%

Transmission
Categorical

IMBALANCE  MISSING 

Distinct 8
Distinct (%) < 0.1%
Missing 224691
Missing (%) 54.4%
Memory size 3.1 MiB
Standard 143915
None or Unspecified 23889
Powershift 11991
Powershuttle 4286
Hydrostatic 3342
Other values (3) 584

Length

Max length 19
Median length 8
Mean length 9.6796236
Min length 8

Characters and Unicode

Total characters 1819837
Distinct characters 26
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Powershuttle
2nd row Standard
3rd row Standard
4th row Standard
5th row Standard

Common Values

ValueCountFrequency (%)
Standard 143915
34.9%
None or Unspecified 23889
 
5.8%
Powershift 11991
 
2.9%
Powershuttle 4286
 
1.0%
Hydrostatic 3342
 
0.8%
Direct Drive 422
 
0.1%
Autoshift 118
 
< 0.1%
AutoShift 44
 
< 0.1%
(Missing) 224691
54.4%
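
The table above lists `Autoshift` (118) and `AutoShift` (44) as separate levels, which look like one category spelled with different capitalisation. A minimal sketch, assuming the two spellings are meant to be the same level, folds labels to a common case before counting. The variable names are illustrative and the counts are copied from the table.

```python
from collections import Counter

# Level frequencies copied from the Transmission "Common Values" table.
transmission_counts = {
    "Standard": 143915,
    "None or Unspecified": 23889,
    "Powershift": 11991,
    "Powershuttle": 4286,
    "Hydrostatic": 3342,
    "Direct Drive": 422,
    "Autoshift": 118,
    "AutoShift": 44,
}

# Merge levels that differ only by letter case.
merged = Counter()
for label, n in transmission_counts.items():
    merged[label.casefold()] += n

print(len(transmission_counts), "levels before,", len(merged), "after")
print("autoshift:", merged["autoshift"])
```

After merging, seven levels remain and the combined `autoshift` count is 162, the same figure the word-frequency table further down shows for the lowercased token.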

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
standard 143915
60.9%
none 23889
 
10.1%
or 23889
 
10.1%
unspecified 23889
 
10.1%
powershift 11991
 
5.1%
powershuttle 4286
 
1.8%
hydrostatic 3342
 
1.4%
direct 422
 
0.2%
drive 422
 
0.2%
autoshift 162
 
0.1%

Most occurring characters

ValueCountFrequency (%)
d 315061
17.3%
a 291172
16.0%
n 191693
10.5%
r 188267
10.3%
t 171908
9.4%
S 143959
7.9%
e 93074
 
5.1%
o 67559
 
3.7%
i 64117
 
3.5%
48200
 
2.6%
Other values (16) 244827
13.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1559275
85.7%
Uppercase Letter 212362
 
11.7%
Space Separator 48200
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 315061
20.2%
a 291172
18.7%
n 191693
12.3%
r 188267
12.1%
t 171908
11.0%
e 93074
 
6.0%
o 67559
 
4.3%
i 64117
 
4.1%
s 43626
 
2.8%
f 36042
 
2.3%
Other values (8) 96756
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
S 143959
67.8%
U 23889
 
11.2%
N 23889
 
11.2%
P 16277
 
7.7%
H 3342
 
1.6%
D 844
 
0.4%
A 162
 
0.1%
Space Separator
ValueCountFrequency (%)
48200
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1771637
97.4%
Common 48200
 
2.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
d 315061
17.8%
a 291172
16.4%
n 191693
10.8%
r 188267
10.6%
t 171908
9.7%
S 143959
8.1%
e 93074
 
5.3%
o 67559
 
3.8%
i 64117
 
3.6%
s 43626
 
2.5%
Other values (15) 201201
11.4%
Common
ValueCountFrequency (%)
48200
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1819837
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d 315061
17.3%
a 291172
16.0%
n 191693
10.5%
r 188267
10.3%
t 171908
9.4%
S 143959
7.9%
e 93074
 
5.1%
o 67559
 
3.7%
i 64117
 
3.5%
48200
 
2.6%
Other values (16) 244827
13.5%

Turbocharged
Categorical

IMBALANCE  MISSING 

Distinct 2
Distinct (%) < 0.1%
Missing 331602
Missing (%) 80.3%
Memory size 3.1 MiB
None or Unspecified 77111
Yes 3985

Length

Max length 19
Median length 19
Mean length 18.213771
Min length 3

Characters and Unicode

Total characters 1477064
Distinct characters 14
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row None or Unspecified
2nd row Yes
3rd row None or Unspecified
4th row None or Unspecified
5th row None or Unspecified

Common Values

ValueCountFrequency (%)
None or Unspecified 77111
 
18.7%
Yes 3985
 
1.0%
(Missing) 331602
80.3%

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
none 77111
32.8%
or 77111
32.8%
unspecified 77111
32.8%
yes 3985
 
1.7%

Most occurring characters

ValueCountFrequency (%)
e 235318
15.9%
o 154222
10.4%
n 154222
10.4%
154222
10.4%
i 154222
10.4%
s 81096
 
5.5%
N 77111
 
5.2%
r 77111
 
5.2%
U 77111
 
5.2%
p 77111
 
5.2%
Other values (4) 235318
15.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1164635
78.8%
Uppercase Letter 158207
 
10.7%
Space Separator 154222
 
10.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 235318
20.2%
o 154222
13.2%
n 154222
13.2%
i 154222
13.2%
s 81096
 
7.0%
r 77111
 
6.6%
p 77111
 
6.6%
c 77111
 
6.6%
f 77111
 
6.6%
d 77111
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
N 77111
48.7%
U 77111
48.7%
Y 3985
 
2.5%
Space Separator
ValueCountFrequency (%)
154222
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1322842
89.6%
Common 154222
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 235318
17.8%
o 154222
11.7%
n 154222
11.7%
i 154222
11.7%
s 81096
 
6.1%
N 77111
 
5.8%
r 77111
 
5.8%
U 77111
 
5.8%
p 77111
 
5.8%
c 77111
 
5.8%
Other values (3) 158207
12.0%
Common
ValueCountFrequency (%)
154222
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1477064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 235318
15.9%
o 154222
10.4%
n 154222
10.4%
154222
10.4%
i 154222
10.4%
s 81096
 
5.5%
N 77111
 
5.2%
r 77111
 
5.2%
U 77111
 
5.2%
p 77111
 
5.2%
Other values (4) 235318
15.9%

Blade_Extension
Categorical

IMBALANCE  MISSING 

Distinct 2
Distinct (%) < 0.1%
Missing 386715
Missing (%) 93.7%
Memory size 3.1 MiB
None or Unspecified 25406
Yes 577

Length

Max length 19
Median length 19
Mean length 18.644691
Min length 3

Characters and Unicode

Total characters 484445
Distinct characters 14
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Yes
2nd row None or Unspecified
3rd row None or Unspecified
4th row None or Unspecified
5th row None or Unspecified

Common Values

ValueCountFrequency (%)
None or Unspecified 25406
 
6.2%
Yes 577
 
0.1%
(Missing) 386715
93.7%

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
none 25406
33.1%
or 25406
33.1%
unspecified 25406
33.1%
yes 577
 
0.8%

Most occurring characters

ValueCountFrequency (%)
e 76795
15.9%
o 50812
10.5%
n 50812
10.5%
50812
10.5%
i 50812
10.5%
s 25983
 
5.4%
N 25406
 
5.2%
r 25406
 
5.2%
U 25406
 
5.2%
p 25406
 
5.2%
Other values (4) 76795
15.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 382244
78.9%
Uppercase Letter 51389
 
10.6%
Space Separator 50812
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 76795
20.1%
o 50812
13.3%
n 50812
13.3%
i 50812
13.3%
s 25983
 
6.8%
r 25406
 
6.6%
p 25406
 
6.6%
c 25406
 
6.6%
f 25406
 
6.6%
d 25406
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
N 25406
49.4%
U 25406
49.4%
Y 577
 
1.1%
Space Separator
ValueCountFrequency (%)
50812
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 433633
89.5%
Common 50812
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 76795
17.7%
o 50812
11.7%
n 50812
11.7%
i 50812
11.7%
s 25983
 
6.0%
N 25406
 
5.9%
r 25406
 
5.9%
U 25406
 
5.9%
p 25406
 
5.9%
c 25406
 
5.9%
Other values (3) 51389
11.9%
Common
ValueCountFrequency (%)
50812
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 484445
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 76795
15.9%
o 50812
10.5%
n 50812
10.5%
50812
10.5%
i 50812
10.5%
s 25983
 
5.4%
N 25406
 
5.2%
r 25406
 
5.2%
U 25406
 
5.2%
p 25406
 
5.2%
Other values (4) 76795
15.9%

Blade_Width
Categorical

Distinct 6
Distinct (%) < 0.1%
Missing 386715
Missing (%) 93.7%
Memory size 3.1 MiB
14' 9867
None or Unspecified 9521
12' 5201
16' 960
13' 335

Length

Max length 19
Median length 3
Mean length 8.8667205
Min length 3

Characters and Unicode

Total characters 230384
Distinct characters 20
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row None or Unspecified
2nd row None or Unspecified
3rd row 12'
4th row 14'
5th row 14'

Common Values

ValueCountFrequency (%)
14' 9867
 
2.4%
None or Unspecified 9521
 
2.3%
12' 5201
 
1.3%
16' 960
 
0.2%
13' 335
 
0.1%
<12' 99
 
< 0.1%
(Missing) 386715
93.7%

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
14 9867
21.9%
none 9521
21.1%
or 9521
21.1%
unspecified 9521
21.1%
12 5300
11.8%
16 960
 
2.1%
13 335
 
0.7%

Most occurring characters

ValueCountFrequency (%)
e 28563
12.4%
o 19042
 
8.3%
n 19042
 
8.3%
19042
 
8.3%
i 19042
 
8.3%
1 16462
 
7.1%
' 16462
 
7.1%
4 9867
 
4.3%
c 9521
 
4.1%
d 9521
 
4.1%
Other values (10) 63820
27.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 142815
62.0%
Decimal Number 32924
 
14.3%
Space Separator 19042
 
8.3%
Uppercase Letter 19042
 
8.3%
Other Punctuation 16462
 
7.1%
Math Symbol 99
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 28563
20.0%
o 19042
13.3%
n 19042
13.3%
i 19042
13.3%
c 9521
 
6.7%
d 9521
 
6.7%
f 9521
 
6.7%
s 9521
 
6.7%
p 9521
 
6.7%
r 9521
 
6.7%
Decimal Number
ValueCountFrequency (%)
1 16462
50.0%
4 9867
30.0%
2 5300
 
16.1%
6 960
 
2.9%
3 335
 
1.0%
Uppercase Letter
ValueCountFrequency (%)
U 9521
50.0%
N 9521
50.0%
Space Separator
ValueCountFrequency (%)
19042
100.0%
Other Punctuation
ValueCountFrequency (%)
' 16462
100.0%
Math Symbol
ValueCountFrequency (%)
< 99
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 161857
70.3%
Common 68527
29.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 28563
17.6%
o 19042
11.8%
n 19042
11.8%
i 19042
11.8%
c 9521
 
5.9%
d 9521
 
5.9%
f 9521
 
5.9%
s 9521
 
5.9%
p 9521
 
5.9%
U 9521
 
5.9%
Other values (2) 19042
11.8%
Common
ValueCountFrequency (%)
19042
27.8%
1 16462
24.0%
' 16462
24.0%
4 9867
14.4%
2 5300
 
7.7%
6 960
 
1.4%
3 335
 
0.5%
< 99
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 230384
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 28563
12.4%
o 19042
 
8.3%
n 19042
 
8.3%
19042
 
8.3%
i 19042
 
8.3%
1 16462
 
7.1%
' 16462
 
7.1%
4 9867
 
4.3%
c 9521
 
4.1%
d 9521
 
4.1%
Other values (10) 63820
27.7%

Enclosure_Type
Categorical

IMBALANCE  MISSING 

Distinct 3
Distinct (%) < 0.1%
Missing 386715
Missing (%) 93.7%
Memory size 3.1 MiB
None or Unspecified 22469
Low Profile 2675
High Profile 839

Length

Max length 19
Median length 19
Mean length 17.950352
Min length 11

Characters and Unicode

Total characters 466404
Distinct characters 20
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row None or Unspecified
2nd row None or Unspecified
3rd row None or Unspecified
4th row None or Unspecified
5th row Low Profile

Common Values

ValueCountFrequency (%)
None or Unspecified 22469
 
5.4%
Low Profile 2675
 
0.6%
High Profile 839
 
0.2%
(Missing) 386715
93.7%

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
none 22469
30.2%
or 22469
30.2%
unspecified 22469
30.2%
profile 3514
 
4.7%
low 2675
 
3.6%
high 839
 
1.1%

Most occurring characters

ValueCountFrequency (%)
e 70921
15.2%
o 51127
11.0%
i 49291
10.6%
48452
10.4%
n 44938
9.6%
r 25983
 
5.6%
f 25983
 
5.6%
c 22469
 
4.8%
d 22469
 
4.8%
N 22469
 
4.8%
Other values (10) 82302
17.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 365986
78.5%
Uppercase Letter 51966
 
11.1%
Space Separator 48452
 
10.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 70921
19.4%
o 51127
14.0%
i 49291
13.5%
n 44938
12.3%
r 25983
 
7.1%
f 25983
 
7.1%
c 22469
 
6.1%
d 22469
 
6.1%
p 22469
 
6.1%
s 22469
 
6.1%
Other values (4) 7867
 
2.1%
Uppercase Letter
ValueCountFrequency (%)
N 22469
43.2%
U 22469
43.2%
P 3514
 
6.8%
L 2675
 
5.1%
H 839
 
1.6%
Space Separator
ValueCountFrequency (%)
48452
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 417952
89.6%
Common 48452
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 70921
17.0%
o 51127
12.2%
i 49291
11.8%
n 44938
10.8%
r 25983
 
6.2%
f 25983
 
6.2%
c 22469
 
5.4%
d 22469
 
5.4%
N 22469
 
5.4%
p 22469
 
5.4%
Other values (9) 59833
14.3%
Common
ValueCountFrequency (%)
48452
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 466404
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 70921
15.2%
o 51127
11.0%
i 49291
10.6%
48452
10.4%
n 44938
9.6%
r 25983
 
5.6%
f 25983
 
5.6%
c 22469
 
4.8%
d 22469
 
4.8%
N 22469
 
4.8%
Other values (10) 82302
17.6%

Engine_Horsepower
Categorical

IMBALANCE  MISSING 

Distinct 2
Distinct (%) < 0.1%
Missing 386715
Missing (%) 93.7%
Memory size 3.1 MiB
No 24642
Variable 1341

Length

Max length 8
Median length 2
Mean length 2.309664
Min length 2

Characters and Unicode

Total characters 60012
Distinct characters 9
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row No
2nd row No
3rd row No
4th row No
5th row Variable

Common Values

ValueCountFrequency (%)
No 24642
 
6.0%
Variable 1341
 
0.3%
(Missing) 386715
93.7%
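
The IMBALANCE badge on this variable corresponds to the 70.7% figure given for Engine_Horsepower in the report's Alerts list. One score that reproduces that figure is one minus the Shannon entropy of the level frequencies, divided by the logarithm of the number of levels. Treating this as the report's formula is an assumption, since the report does not state how the score is computed; `imbalance_score` is a hypothetical helper written for this illustration.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 for perfectly balanced levels, near 1 when one level dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    # Shannon entropy in bits over the non-missing levels.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Non-missing level frequencies copied from the Engine_Horsepower table.
engine_horsepower = {"No": 24642, "Variable": 1341}
score = imbalance_score(list(engine_horsepower.values()))
print(f"{score:.1%}")  # 70.7%
```

Missing values are left out of the calculation here, so the score describes only the 25983 observed rows.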

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
no 24642
94.8%
variable 1341
 
5.2%

Most occurring characters

ValueCountFrequency (%)
N 24642
41.1%
o 24642
41.1%
a 2682
 
4.5%
V 1341
 
2.2%
r 1341
 
2.2%
i 1341
 
2.2%
b 1341
 
2.2%
l 1341
 
2.2%
e 1341
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 34029
56.7%
Uppercase Letter 25983
43.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 24642
72.4%
a 2682
 
7.9%
r 1341
 
3.9%
i 1341
 
3.9%
b 1341
 
3.9%
l 1341
 
3.9%
e 1341
 
3.9%
Uppercase Letter
ValueCountFrequency (%)
N 24642
94.8%
V 1341
 
5.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 60012
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 24642
41.1%
o 24642
41.1%
a 2682
 
4.5%
V 1341
 
2.2%
r 1341
 
2.2%
i 1341
 
2.2%
b 1341
 
2.2%
l 1341
 
2.2%
e 1341
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 60012
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 24642
41.1%
o 24642
41.1%
a 2682
 
4.5%
V 1341
 
2.2%
r 1341
 
2.2%
i 1341
 
2.2%
b 1341
 
2.2%
l 1341
 
2.2%
e 1341
 
2.2%

Hydraulics
Categorical

Distinct 12
Distinct (%) < 0.1%
Missing 82565
Missing (%) 20.0%
Memory size 3.1 MiB
2 Valve 145317
Standard 106515
Auxiliary 43224
Base + 1 Function 25511
3 Valve 5807
Other values (7) 3759

Length

Max length 19
Median length 17
Mean length 8.3779689
Min length 7

Characters and Unicode

Total characters 2765844
Distinct characters 32
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 2 Valve
2nd row 2 Valve
3rd row Auxiliary
4th row 2 Valve
5th row Auxiliary

Common Values

ValueCountFrequency (%)
2 Valve 145317
35.2%
Standard 106515
25.8%
Auxiliary 43224
 
10.5%
Base + 1 Function 25511
 
6.2%
3 Valve 5807
 
1.4%
4 Valve 3077
 
0.7%
Base + 3 Function 311
 
0.1%
Base + 2 Function 132
 
< 0.1%
Base + 5 Function 94
 
< 0.1%
Base + 4 Function 81
 
< 0.1%
Other values (2) 64
 
< 0.1%
(Missing) 82565
20.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
valve 154201
27.4%
2 145449
25.8%
standard 106515
18.9%
auxiliary 43224
 
7.7%
base 26183
 
4.7%
26183
 
4.7%
function 26183
 
4.7%
1 25511
 
4.5%
3 6118
 
1.1%
4 3158
 
0.6%
Other values (5) 178
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 436638
15.8%
232770
 
8.4%
d 213040
 
7.7%
l 197425
 
7.1%
e 180414
 
6.5%
n 158901
 
5.7%
V 154201
 
5.6%
v 154201
 
5.6%
r 149749
 
5.4%
2 145449
 
5.3%
Other values (22) 743056
26.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1970181
71.2%
Uppercase Letter 356326
 
12.9%
Space Separator 232770
 
8.4%
Decimal Number 180384
 
6.5%
Math Symbol 26183
 
0.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 436638
22.2%
d 213040
10.8%
l 197425
10.0%
e 180414
9.2%
n 158901
 
8.1%
v 154201
 
7.8%
r 149749
 
7.6%
t 132698
 
6.7%
i 112651
 
5.7%
u 69407
 
3.5%
Other values (7) 165057
 
8.4%
Uppercase Letter
ValueCountFrequency (%)
V 154201
43.3%
S 106515
29.9%
A 43224
 
12.1%
F 26183
 
7.3%
B 26183
 
7.3%
N 10
 
< 0.1%
U 10
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 145449
80.6%
1 25511
 
14.1%
3 6118
 
3.4%
4 3158
 
1.8%
5 94
 
0.1%
6 54
 
< 0.1%
Space Separator
ValueCountFrequency (%)
232770
100.0%
Math Symbol
ValueCountFrequency (%)
+ 26183
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2326507
84.1%
Common 439337
 
15.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 436638
18.8%
d 213040
9.2%
l 197425
8.5%
e 180414
7.8%
n 158901
 
6.8%
V 154201
 
6.6%
v 154201
 
6.6%
r 149749
 
6.4%
t 132698
 
5.7%
i 112651
 
4.8%
Other values (14) 436589
18.8%
Common
ValueCountFrequency (%)
232770
53.0%
2 145449
33.1%
+ 26183
 
6.0%
1 25511
 
5.8%
3 6118
 
1.4%
4 3158
 
0.7%
5 94
 
< 0.1%
6 54
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2765844
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 436638
15.8%
232770
 
8.4%
d 213040
 
7.7%
l 197425
 
7.1%
e 180414
 
6.5%
n 158901
 
5.7%
V 154201
 
5.6%
v 154201
 
5.6%
r 149749
 
5.4%
2 145449
 
5.3%
Other values (22) 743056
26.9%

Pushblock
Categorical

Distinct 2
Distinct (%) < 0.1%
Missing 386715
Missing (%) 93.7%
Memory size 3.1 MiB
None or Unspecified 20017
Yes 5966

Length

Max length 19
Median length 19
Mean length 15.326213
Min length 3

Characters and Unicode

Total characters 398221
Distinct characters 14
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row None or Unspecified
2nd row None or Unspecified
3rd row Yes
4th row None or Unspecified
5th row Yes

Common Values

ValueCountFrequency (%)
None or Unspecified 20017
 
4.9%
Yes 5966
 
1.4%
(Missing) 386715
93.7%

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
none 20017
30.3%
or 20017
30.3%
unspecified 20017
30.3%
yes 5966
 
9.0%

Most occurring characters

ValueCountFrequency (%)
e 66017
16.6%
o 40034
10.1%
n 40034
10.1%
40034
10.1%
i 40034
10.1%
s 25983
 
6.5%
N 20017
 
5.0%
r 20017
 
5.0%
U 20017
 
5.0%
p 20017
 
5.0%
Other values (4) 66017
16.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 312187
78.4%
Uppercase Letter 46000
 
11.6%
Space Separator 40034
 
10.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 66017
21.1%
o 40034
12.8%
n 40034
12.8%
i 40034
12.8%
s 25983
 
8.3%
r 20017
 
6.4%
p 20017
 
6.4%
c 20017
 
6.4%
f 20017
 
6.4%
d 20017
 
6.4%
Uppercase Letter
ValueCountFrequency (%)
N 20017
43.5%
U 20017
43.5%
Y 5966
 
13.0%
Space Separator
ValueCountFrequency (%)
40034
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 358187
89.9%
Common 40034
 
10.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 66017
18.4%
o 40034
11.2%
n 40034
11.2%
i 40034
11.2%
s 25983
 
7.3%
N 20017
 
5.6%
r 20017
 
5.6%
U 20017
 
5.6%
p 20017
 
5.6%
c 20017
 
5.6%
Other values (3) 46000
12.8%
Common
ValueCountFrequency (%)
40034
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 398221
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 66017
16.6%
o 40034
10.1%
n 40034
10.1%
40034
10.1%
i 40034
10.1%
s 25983
 
6.5%
N 20017
 
5.0%
r 20017
 
5.0%
U 20017
 
5.0%
p 20017
 
5.0%
Other values (4) 66017
16.6%

Ripper
Categorical

Distinct 4
Distinct (%) < 0.1%
Missing 305753
Missing (%) 74.1%
Memory size 3.1 MiB
None or Unspecified 85405
Yes 8185
Multi Shank 8071
Single Shank 5284

Length

Max length 19
Median length 19
Mean length 16.825836
Min length 3

Characters and Unicode

Total characters 1799439
Distinct characters 23
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row None or Unspecified
2nd row None or Unspecified
3rd row None or Unspecified
4th row None or Unspecified
5th row None or Unspecified

Common Values

ValueCountFrequency (%)
None or Unspecified 85405
 
20.7%
Yes 8185
 
2.0%
Multi Shank 8071
 
2.0%
Single Shank 5284
 
1.3%
(Missing) 305753
74.1%

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
none 85405
29.3%
or 85405
29.3%
unspecified 85405
29.3%
shank 13355
 
4.6%
yes 8185
 
2.8%
multi 8071
 
2.8%
single 5284
 
1.8%

Most occurring characters

ValueCountFrequency (%)
e 269684
15.0%
n 189449
10.5%
184165
10.2%
i 184165
10.2%
o 170810
9.5%
s 93590
 
5.2%
N 85405
 
4.7%
c 85405
 
4.7%
d 85405
 
4.7%
f 85405
 
4.7%
Other values (13) 365956
20.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1409569
78.3%
Uppercase Letter 205705
 
11.4%
Space Separator 184165
 
10.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 269684
19.1%
n 189449
13.4%
i 184165
13.1%
o 170810
12.1%
s 93590
 
6.6%
c 85405
 
6.1%
d 85405
 
6.1%
f 85405
 
6.1%
p 85405
 
6.1%
r 85405
 
6.1%
Other values (7) 74846
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
N 85405
41.5%
U 85405
41.5%
S 18639
 
9.1%
Y 8185
 
4.0%
M 8071
 
3.9%
Space Separator
ValueCountFrequency (%)
184165
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1615274
89.8%
Common 184165
 
10.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 269684
16.7%
n 189449
11.7%
i 184165
11.4%
o 170810
10.6%
s 93590
 
5.8%
N 85405
 
5.3%
c 85405
 
5.3%
d 85405
 
5.3%
f 85405
 
5.3%
p 85405
 
5.3%
Other values (12) 280551
17.4%
Common
ValueCountFrequency (%)
184165
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1799439
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 269684
15.0%
n 189449
10.5%
184165
10.2%
i 184165
10.2%
o 170810
9.5%
s 93590
 
5.2%
N 85405
 
4.7%
c 85405
 
4.7%
d 85405
 
4.7%
f 85405
 
4.7%
Other values (13) 365956
20.3%

Scarifier
Categorical

Distinct 2
Distinct (%) < 0.1%
Missing 386704
Missing (%) 93.7%
Memory size 3.1 MiB
None or Unspecified 13033
Yes 12961

Length

Max length 19
Median length 19
Mean length 11.022159
Min length 3

Characters and Unicode

Total characters 286510
Distinct characters 14
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Yes
2nd row Yes
3rd row Yes
4th row Yes
5th row None or Unspecified

Common Values

ValueCountFrequency (%)
None or Unspecified 13033
 
3.2%
Yes 12961
 
3.1%
(Missing) 386704
93.7%

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
none 13033
25.0%
or 13033
25.0%
unspecified 13033
25.0%
yes 12961
24.9%

Most occurring characters

ValueCountFrequency (%)
e 52060
18.2%
o 26066
9.1%
n 26066
9.1%
26066
9.1%
i 26066
9.1%
s 25994
9.1%
N 13033
 
4.5%
r 13033
 
4.5%
U 13033
 
4.5%
p 13033
 
4.5%
Other values (4) 52060
18.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 221417
77.3%
Uppercase Letter 39027
 
13.6%
Space Separator 26066
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 52060
23.5%
o 26066
11.8%
n 26066
11.8%
i 26066
11.8%
s 25994
11.7%
r 13033
 
5.9%
p 13033
 
5.9%
c 13033
 
5.9%
f 13033
 
5.9%
d 13033
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
N 13033
33.4%
U 13033
33.4%
Y 12961
33.2%
Space Separator
ValueCountFrequency (%)
26066
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 260444
90.9%
Common 26066
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 52060
20.0%
o 26066
10.0%
n 26066
10.0%
i 26066
10.0%
s 25994
10.0%
N 13033
 
5.0%
r 13033
 
5.0%
U 13033
 
5.0%
p 13033
 
5.0%
c 13033
 
5.0%
Other values (3) 39027
15.0%
Common
ValueCountFrequency (%)
26066
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 286510
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 52060
18.2%
o 26066
9.1%
n 26066
9.1%
26066
9.1%
i 26066
9.1%
s 25994
9.1%
N 13033
 
4.5%
r 13033
 
4.5%
U 13033
 
4.5%
p 13033
 
4.5%
Other values (4) 52060
18.2%

Tip_Control
Categorical

Distinct 3
Distinct (%) < 0.1%
Missing 386715
Missing (%) 93.7%
Memory size 3.1 MiB
None or Unspecified 16832
Sideshift & Tip 7164
Tip 1987

Length

Max length 19
Median length 19
Mean length 16.673556
Min length 3

Characters and Unicode

Total characters 433229
Distinct characters 18
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Sideshift & Tip
2nd row None or Unspecified
3rd row None or Unspecified
4th row None or Unspecified
5th row None or Unspecified

Common Values

ValueCountFrequency (%)
None or Unspecified 16832
 
4.1%
Sideshift & Tip 7164
 
1.7%
Tip 1987
 
0.5%
(Missing) 386715
93.7%

Length

Histogram of lengths of the category

Common Values (Plot)
ValueCountFrequency (%)
none 16832
22.8%
or 16832
22.8%
unspecified 16832
22.8%
tip 9151
12.4%
sideshift 7164
9.7%
7164
9.7%

Most occurring characters

ValueCountFrequency (%)
e 57660
13.3%
i 57143
13.2%
47992
11.1%
n 33664
 
7.8%
o 33664
 
7.8%
p 25983
 
6.0%
d 23996
 
5.5%
f 23996
 
5.5%
s 23996
 
5.5%
N 16832
 
3.9%
Other values (8) 88303
20.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 328094
75.7%
Uppercase Letter 49979
 
11.5%
Space Separator 47992
 
11.1%
Other Punctuation 7164
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 57660
17.6%
i 57143
17.4%
n 33664
10.3%
o 33664
10.3%
p 25983
7.9%
d 23996
7.3%
f 23996
7.3%
s 23996
7.3%
c 16832
 
5.1%
r 16832
 
5.1%
Other values (2) 14328
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
N 16832
33.7%
U 16832
33.7%
T 9151
18.3%
S 7164
14.3%
Space Separator
ValueCountFrequency (%)
47992
100.0%
Other Punctuation
ValueCountFrequency (%)
& 7164
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 378073
87.3%
Common 55156
 
12.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 57660
15.3%
i 57143
15.1%
n 33664
8.9%
o 33664
8.9%
p 25983
6.9%
d 23996
 
6.3%
f 23996
 
6.3%
s 23996
 
6.3%
N 16832
 
4.5%
c 16832
 
4.5%
Other values (6) 64307
17.0%
Common
ValueCountFrequency (%)
47992
87.0%
& 7164
 
13.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 433229
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 57660
13.3%
i 57143
13.2%
47992
11.1%
n 33664
 
7.8%
o 33664
 
7.8%
p 25983
 
6.0%
d 23996
 
5.5%
f 23996
 
5.5%
s 23996
 
5.5%
N 16832
 
3.9%
Other values (8) 88303
20.4%

Tire_Size
Categorical

Distinct: 17
Distinct (%): < 0.1%
Missing: 315060
Missing (%): 76.3%
Memory size: 3.1 MiB

None or Unspecified: 47823
20.5: 15773
14": 9111
23.5: 8760
26.5: 4635
Other values (12): 11536

Length

Max length: 19
Median length: 7
Mean length: 11.279973
Min length: 3

Characters and Unicode

Total characters: 1101354
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None or Unspecified
2nd row: 23.5
3rd row: None or Unspecified
4th row: 13"
5th row: 26.5

Common Values

Value | Count | Frequency (%)
None or Unspecified | 47823 | 11.6%
20.5 | 15773 | 3.8%
14" | 9111 | 2.2%
23.5 | 8760 | 2.1%
26.5 | 4635 | 1.1%
17.5 | 3971 | 1.0%
29.5 | 2767 | 0.7%
17.5" | 1815 | 0.4%
13" | 776 | 0.2%
20.5" | 737 | 0.2%
Other values (7) | 1470 | 0.4%
(Missing) | 315060 | 76.3%
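
The table above lists some sizes twice, once bare and once with an inch mark (`20.5` and `20.5"`, `17.5` and `17.5"`). A minimal clean-up sketch follows. It assumes both spellings mean the same size and that "None or Unspecified" should become a missing value. The function name `tire_size_inches` is hypothetical, and whether this merge is appropriate depends on how the data were recorded.

```python
import re
from typing import Optional

PLACEHOLDER = "None or Unspecified"

# Leading number, optionally with a decimal part; any suffix is ignored.
_LEADING_NUMBER = re.compile(r"\s*(\d+(?:\.\d+)?)")


def tire_size_inches(raw: Optional[str]) -> Optional[float]:
    """Convert a Tire_Size label to a float, or None when no size is given."""
    if raw is None or raw == PLACEHOLDER:
        return None
    match = _LEADING_NUMBER.match(raw)
    if match is None:
        return None  # unrecognised label: treat as missing
    # Assumption: '20.5"' and '20.5' are the same size written two ways.
    return float(match.group(1))


print(tire_size_inches('20.5"'), tire_size_inches("20.5"))
```

Taking only the leading number also tolerates a trailing unit word, which the character counts for this variable suggest occurs in a few rows.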

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 47823
24.7%
or 47823
24.7%
unspecified 47823
24.7%
20.5 16510
 
8.5%
14 9111
 
4.7%
23.5 9069
 
4.7%
17.5 5786
 
3.0%
26.5 4635
 
2.4%
29.5 2767
 
1.4%
15.5 1073
 
0.6%
Other values (5) 867
 
0.4%

Most occurring characters

ValueCountFrequency (%)
e 143469
13.0%
n 95649
 
8.7%
95649
 
8.7%
i 95649
 
8.7%
o 95646
 
8.7%
c 47826
 
4.3%
N 47823
 
4.3%
f 47823
 
4.3%
d 47823
 
4.3%
p 47823
 
4.3%
Other values (15) 336174
30.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 717357
65.1%
Decimal Number 139490
 
12.7%
Space Separator 95649
 
8.7%
Uppercase Letter 95646
 
8.7%
Other Punctuation 53212
 
4.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 143469
20.0%
n 95649
13.3%
i 95649
13.3%
o 95646
13.3%
c 47826
 
6.7%
f 47823
 
6.7%
d 47823
 
6.7%
p 47823
 
6.7%
s 47823
 
6.7%
r 47823
 
6.7%
Decimal Number
ValueCountFrequency (%)
5 40913
29.3%
2 33001
23.7%
1 16778
12.0%
0 16578
11.9%
3 9865
 
7.1%
4 9111
 
6.5%
7 5842
 
4.2%
6 4635
 
3.3%
9 2767
 
2.0%
Uppercase Letter
ValueCountFrequency (%)
N 47823
50.0%
U 47823
50.0%
Other Punctuation
ValueCountFrequency (%)
. 39916
75.0%
" 13296
 
25.0%
Space Separator
ValueCountFrequency (%)
95649
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 813003
73.8%
Common 288351
 
26.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 143469
17.6%
n 95649
11.8%
i 95649
11.8%
o 95646
11.8%
c 47826
 
5.9%
N 47823
 
5.9%
f 47823
 
5.9%
d 47823
 
5.9%
p 47823
 
5.9%
s 47823
 
5.9%
Other values (3) 95649
11.8%
Common
ValueCountFrequency (%)
95649
33.2%
5 40913
14.2%
. 39916
13.8%
2 33001
 
11.4%
1 16778
 
5.8%
0 16578
 
5.7%
" 13296
 
4.6%
3 9865
 
3.4%
4 9111
 
3.2%
7 5842
 
2.0%
Other values (2) 7402
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1101354
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 143469
13.0%
n 95649
 
8.7%
95649
 
8.7%
i 95649
 
8.7%
o 95646
 
8.7%
c 47826
 
4.3%
N 47823
 
4.3%
f 47823
 
4.3%
d 47823
 
4.3%
p 47823
 
4.3%
Other values (15) 336174
30.5%

Coupler
Categorical

IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 192019
Missing (%): 46.5%
Memory size: 3.1 MiB

None or Unspecified: 190449
Manual: 23918
Hydraulic: 6312

Length

Max length: 19
Median length: 19
Mean length: 17.304986
Min length: 6

Characters and Unicode

Total characters: 3818847
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None or Unspecified
2nd row: None or Unspecified
3rd row: None or Unspecified
4th row: None or Unspecified
5th row: None or Unspecified

Common Values

Value | Count | Frequency (%)
None or Unspecified | 190449 | 46.1%
Manual | 23918 | 5.8%
Hydraulic | 6312 | 1.5%
(Missing) | 192019 | 46.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 190449
31.7%
or 190449
31.7%
unspecified 190449
31.7%
manual 23918
 
4.0%
hydraulic 6312
 
1.0%

Most occurring characters

ValueCountFrequency (%)
e 571347
15.0%
n 404816
10.6%
i 387210
10.1%
o 380898
10.0%
380898
10.0%
c 196761
 
5.2%
r 196761
 
5.2%
d 196761
 
5.2%
f 190449
 
5.0%
N 190449
 
5.0%
Other values (9) 722497
18.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3026821
79.3%
Uppercase Letter 411128
 
10.8%
Space Separator 380898
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 571347
18.9%
n 404816
13.4%
i 387210
12.8%
o 380898
12.6%
c 196761
 
6.5%
r 196761
 
6.5%
d 196761
 
6.5%
f 190449
 
6.3%
p 190449
 
6.3%
s 190449
 
6.3%
Other values (4) 120920
 
4.0%
Uppercase Letter
ValueCountFrequency (%)
N 190449
46.3%
U 190449
46.3%
M 23918
 
5.8%
H 6312
 
1.5%
Space Separator
ValueCountFrequency (%)
380898
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3437949
90.0%
Common 380898
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 571347
16.6%
n 404816
11.8%
i 387210
11.3%
o 380898
11.1%
c 196761
 
5.7%
r 196761
 
5.7%
d 196761
 
5.7%
f 190449
 
5.5%
N 190449
 
5.5%
p 190449
 
5.5%
Other values (8) 532048
15.5%
Common
ValueCountFrequency (%)
380898
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3818847
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 571347
15.0%
n 404816
10.6%
i 387210
10.1%
o 380898
10.0%
380898
10.0%
c 196761
 
5.2%
r 196761
 
5.2%
d 196761
 
5.2%
f 190449
 
5.0%
N 190449
 
5.0%
Other values (9) 722497
18.9%

Coupler_System
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 367724
Missing (%): 89.1%
Memory size: 3.1 MiB

None or Unspecified: 41727
Yes: 3247

Length

Max length: 19
Median length: 19
Mean length: 17.844844
Min length: 3

Characters and Unicode

Total characters: 802554
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None or Unspecified
2nd row: None or Unspecified
3rd row: None or Unspecified
4th row: None or Unspecified
5th row: None or Unspecified

Common Values

Value | Count | Frequency (%)
None or Unspecified | 41727 | 10.1%
Yes | 3247 | 0.8%
(Missing) | 367724 | 89.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 41727
32.5%
or 41727
32.5%
unspecified 41727
32.5%
yes 3247
 
2.5%

Most occurring characters

ValueCountFrequency (%)
e 128428
16.0%
o 83454
10.4%
n 83454
10.4%
83454
10.4%
i 83454
10.4%
s 44974
 
5.6%
N 41727
 
5.2%
r 41727
 
5.2%
U 41727
 
5.2%
p 41727
 
5.2%
Other values (4) 128428
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 632399
78.8%
Uppercase Letter 86701
 
10.8%
Space Separator 83454
 
10.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 128428
20.3%
o 83454
13.2%
n 83454
13.2%
i 83454
13.2%
s 44974
 
7.1%
r 41727
 
6.6%
p 41727
 
6.6%
c 41727
 
6.6%
f 41727
 
6.6%
d 41727
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
N 41727
48.1%
U 41727
48.1%
Y 3247
 
3.7%
Space Separator
ValueCountFrequency (%)
83454
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 719100
89.6%
Common 83454
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 128428
17.9%
o 83454
11.6%
n 83454
11.6%
i 83454
11.6%
s 44974
 
6.3%
N 41727
 
5.8%
r 41727
 
5.8%
U 41727
 
5.8%
p 41727
 
5.8%
c 41727
 
5.8%
Other values (3) 86701
12.1%
Common
ValueCountFrequency (%)
83454
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 802554
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 128428
16.0%
o 83454
10.4%
n 83454
10.4%
83454
10.4%
i 83454
10.4%
s 44974
 
5.6%
N 41727
 
5.2%
r 41727
 
5.2%
U 41727
 
5.2%
p 41727
 
5.2%
Other values (4) 128428
16.0%

Grouser_Tracks
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 367823
Missing (%): 89.1%
Memory size: 3.1 MiB

None or Unspecified: 41820
Yes: 3055

Length

Max length: 19
Median length: 19
Mean length: 17.910752
Min length: 3

Characters and Unicode

Total characters: 803745
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None or Unspecified
2nd row: None or Unspecified
3rd row: None or Unspecified
4th row: Yes
5th row: Yes

Common Values

Value | Count | Frequency (%)
None or Unspecified | 41820 | 10.1%
Yes | 3055 | 0.7%
(Missing) | 367823 | 89.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 41820
32.5%
or 41820
32.5%
unspecified 41820
32.5%
yes 3055
 
2.4%

Most occurring characters

ValueCountFrequency (%)
e 128515
16.0%
o 83640
10.4%
n 83640
10.4%
83640
10.4%
i 83640
10.4%
s 44875
 
5.6%
N 41820
 
5.2%
r 41820
 
5.2%
U 41820
 
5.2%
p 41820
 
5.2%
Other values (4) 128515
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 633410
78.8%
Uppercase Letter 86695
 
10.8%
Space Separator 83640
 
10.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 128515
20.3%
o 83640
13.2%
n 83640
13.2%
i 83640
13.2%
s 44875
 
7.1%
r 41820
 
6.6%
p 41820
 
6.6%
c 41820
 
6.6%
f 41820
 
6.6%
d 41820
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
N 41820
48.2%
U 41820
48.2%
Y 3055
 
3.5%
Space Separator
ValueCountFrequency (%)
83640
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 720105
89.6%
Common 83640
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 128515
17.8%
o 83640
11.6%
n 83640
11.6%
i 83640
11.6%
s 44875
 
6.2%
N 41820
 
5.8%
r 41820
 
5.8%
U 41820
 
5.8%
p 41820
 
5.8%
c 41820
 
5.8%
Other values (3) 86695
12.0%
Common
ValueCountFrequency (%)
83640
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 803745
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 128515
16.0%
o 83640
10.4%
n 83640
10.4%
83640
10.4%
i 83640
10.4%
s 44875
 
5.6%
N 41820
 
5.2%
r 41820
 
5.2%
U 41820
 
5.2%
p 41820
 
5.2%
Other values (4) 128515
16.0%

Hydraulics_Flow
Categorical

IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 367823
Missing (%): 89.1%
Memory size: 3.1 MiB

Standard: 44251
High Flow: 597
None or Unspecified: 27

Length

Max length: 19
Median length: 8
Mean length: 8.019922
Min length: 8

Characters and Unicode

Total characters: 359894
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Standard
2nd row: Standard
3rd row: Standard
4th row: Standard
5th row: Standard

Common Values

Value | Count | Frequency (%)
Standard | 44251 | 10.7%
High Flow | 597 | 0.1%
None or Unspecified | 27 | < 0.1%
(Missing) | 367823 | 89.1%
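
The IMBALANCE flag on this variable comes with a 93.1% score in the report's alert list. That figure is consistent with one minus the normalized Shannon entropy of the non-missing value counts. The sketch below rests on that assumption, checked only against the numbers in this report and not against the profiling tool's documentation. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # assumption: a single class is reported as not imbalanced
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Non-missing Hydraulics_Flow counts from the Common Values table.
score = imbalance_score([44251, 597, 27])
print(f"{score:.1%}")  # 93.1%
```

A score near 100% means almost all non-missing rows share one value, as here with "Standard"; a score of 0% would mean the values are evenly spread.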

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
standard 44251
97.2%
high 597
 
1.3%
flow 597
 
1.3%
none 27
 
0.1%
or 27
 
0.1%
unspecified 27
 
0.1%

Most occurring characters

ValueCountFrequency (%)
d 88529
24.6%
a 88502
24.6%
n 44305
12.3%
r 44278
12.3%
S 44251
12.3%
t 44251
12.3%
i 651
 
0.2%
o 651
 
0.2%
651
 
0.2%
w 597
 
0.2%
Other values (12) 3228
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 313744
87.2%
Uppercase Letter 45499
 
12.6%
Space Separator 651
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 88529
28.2%
a 88502
28.2%
n 44305
14.1%
r 44278
14.1%
t 44251
14.1%
i 651
 
0.2%
o 651
 
0.2%
w 597
 
0.2%
l 597
 
0.2%
h 597
 
0.2%
Other values (6) 786
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
S 44251
97.3%
F 597
 
1.3%
H 597
 
1.3%
N 27
 
0.1%
U 27
 
0.1%
Space Separator
ValueCountFrequency (%)
651
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 359243
99.8%
Common 651
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
d 88529
24.6%
a 88502
24.6%
n 44305
12.3%
r 44278
12.3%
S 44251
12.3%
t 44251
12.3%
i 651
 
0.2%
o 651
 
0.2%
w 597
 
0.2%
l 597
 
0.2%
Other values (11) 2631
 
0.7%
Common
ValueCountFrequency (%)
651
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 359894
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d 88529
24.6%
a 88502
24.6%
n 44305
12.3%
r 44278
12.3%
S 44251
12.3%
t 44251
12.3%
i 651
 
0.2%
o 651
 
0.2%
651
 
0.2%
w 597
 
0.2%
Other values (12) 3228
 
0.9%

Track_Type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 310505
Missing (%): 75.2%
Memory size: 3.1 MiB

Steel: 87463
Rubber: 14730

Length

Max length: 6
Median length: 5
Mean length: 5.144139
Min length: 5

Characters and Unicode

Total characters: 525695
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Steel
2nd row: Rubber
3rd row: Steel
4th row: Rubber
5th row: Steel

Common Values

Value | Count | Frequency (%)
Steel | 87463 | 21.2%
Rubber | 14730 | 3.6%
(Missing) | 310505 | 75.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
steel 87463
85.6%
rubber 14730
 
14.4%

Most occurring characters

ValueCountFrequency (%)
e 189656
36.1%
S 87463
16.6%
t 87463
16.6%
l 87463
16.6%
b 29460
 
5.6%
R 14730
 
2.8%
u 14730
 
2.8%
r 14730
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 423502
80.6%
Uppercase Letter 102193
 
19.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 189656
44.8%
t 87463
20.7%
l 87463
20.7%
b 29460
 
7.0%
u 14730
 
3.5%
r 14730
 
3.5%
Uppercase Letter
ValueCountFrequency (%)
S 87463
85.6%
R 14730
 
14.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 525695
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 189656
36.1%
S 87463
16.6%
t 87463
16.6%
l 87463
16.6%
b 29460
 
5.6%
R 14730
 
2.8%
u 14730
 
2.8%
r 14730
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 525695
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 189656
36.1%
S 87463
16.6%
t 87463
16.6%
l 87463
16.6%
b 29460
 
5.6%
R 14730
 
2.8%
u 14730
 
2.8%
r 14730
 
2.8%

Undercarriage_Pad_Width
Categorical

IMBALANCE  MISSING 

Distinct: 19
Distinct (%): < 0.1%
Missing: 309782
Missing (%): 75.1%
Memory size: 3.1 MiB

None or Unspecified: 82444
32 inch: 5287
28 inch: 3152
24 inch: 2998
20 inch: 2664
Other values (14): 6371

Length

Max length: 19
Median length: 19
Mean length: 16.613005
Min length: 7

Characters and Unicode

Total characters: 1709744
Distinct characters: 24
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None or Unspecified
2nd row: None or Unspecified
3rd row: None or Unspecified
4th row: None or Unspecified
5th row: 16 inch

Common Values

Value | Count | Frequency (%)
None or Unspecified | 82444 | 20.0%
32 inch | 5287 | 1.3%
28 inch | 3152 | 0.8%
24 inch | 2998 | 0.7%
20 inch | 2664 | 0.6%
30 inch | 1602 | 0.4%
36 inch | 1544 | 0.4%
18 inch | 1439 | 0.3%
34 inch | 540 | 0.1%
16 inch | 481 | 0.1%
Other values (9) | 765 | 0.2%
(Missing) | 309782 | 75.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 82444
28.6%
or 82444
28.6%
unspecified 82444
28.6%
inch 20472
 
7.1%
32 5287
 
1.8%
28 3152
 
1.1%
24 2998
 
1.0%
20 2664
 
0.9%
30 1602
 
0.6%
36 1544
 
0.5%
Other values (12) 3225
 
1.1%

Most occurring characters

ValueCountFrequency (%)
e 247332
14.5%
n 185360
10.8%
185360
10.8%
i 185360
10.8%
o 164888
9.6%
c 102916
 
6.0%
N 82444
 
4.8%
f 82444
 
4.8%
d 82444
 
4.8%
p 82444
 
4.8%
Other values (14) 308752
18.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1318548
77.1%
Space Separator 185360
 
10.8%
Uppercase Letter 164888
 
9.6%
Decimal Number 40946
 
2.4%
Other Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 247332
18.8%
n 185360
14.1%
i 185360
14.1%
o 164888
12.5%
c 102916
7.8%
f 82444
 
6.3%
d 82444
 
6.3%
p 82444
 
6.3%
s 82444
 
6.3%
r 82444
 
6.3%
Decimal Number
ValueCountFrequency (%)
2 14630
35.7%
3 9354
22.8%
8 4591
 
11.2%
0 4266
 
10.4%
4 3589
 
8.8%
1 2197
 
5.4%
6 2123
 
5.2%
7 144
 
0.4%
5 52
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
N 82444
50.0%
U 82444
50.0%
Space Separator
ValueCountFrequency (%)
185360
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1483436
86.8%
Common 226308
 
13.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 247332
16.7%
n 185360
12.5%
i 185360
12.5%
o 164888
11.1%
c 102916
6.9%
N 82444
 
5.6%
f 82444
 
5.6%
d 82444
 
5.6%
p 82444
 
5.6%
s 82444
 
5.6%
Other values (3) 185360
12.5%
Common
ValueCountFrequency (%)
185360
81.9%
2 14630
 
6.5%
3 9354
 
4.1%
8 4591
 
2.0%
0 4266
 
1.9%
4 3589
 
1.6%
1 2197
 
1.0%
6 2123
 
0.9%
7 144
 
0.1%
5 52
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1709744
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 247332
14.5%
n 185360
10.8%
185360
10.8%
i 185360
10.8%
o 164888
9.6%
c 102916
 
6.0%
N 82444
 
4.8%
f 82444
 
4.8%
d 82444
 
4.8%
p 82444
 
4.8%
Other values (14) 308752
18.1%

Stick_Length
Categorical

IMBALANCE  MISSING 

Distinct: 29
Distinct (%): < 0.1%
Missing: 310437
Missing (%): 75.2%
Memory size: 3.1 MiB

None or Unspecified: 81539
9' 6": 5832
10' 6": 3519
11' 0": 1601
9' 10": 1463
Other values (24): 8307

Length

Max length: 19
Median length: 19
Mean length: 16.279148
Min length: 5

Characters and Unicode

Total characters: 1664722
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: None or Unspecified
2nd row: None or Unspecified
3rd row: 11' 0"
4th row: None or Unspecified
5th row: None or Unspecified

Common Values

Value | Count | Frequency (%)
None or Unspecified | 81539 | 19.8%
9' 6" | 5832 | 1.4%
10' 6" | 3519 | 0.9%
11' 0" | 1601 | 0.4%
9' 10" | 1463 | 0.4%
9' 8" | 1462 | 0.4%
9' 7" | 1423 | 0.3%
12' 10" | 1087 | 0.3%
10' 2" | 1004 | 0.2%
8' 6" | 908 | 0.2%
Other values (19) | 2423 | 0.6%
(Missing) | 310437 | 75.2%
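
The labels above look like feet-and-inches measurements stored as text, which is why the variable is profiled as categorical. A possible conversion to a single numeric value is sketched below. It assumes `'` means feet and `"` means inches, and it maps anything that does not match that pattern, including "None or Unspecified", to a missing value. The name `stick_length_inches` is hypothetical.

```python
import re
from typing import Optional

# Matches labels such as 9' 6" (feet, then inches).
_FEET_INCHES = re.compile(r"""^\s*(\d+)'\s*(\d+)"\s*$""")


def stick_length_inches(raw: Optional[str]) -> Optional[int]:
    """Convert a feet-and-inches label to total inches, or None if unparseable."""
    if raw is None:
        return None
    match = _FEET_INCHES.match(raw)
    if match is None:
        return None  # includes the "None or Unspecified" placeholder
    feet, inches = int(match.group(1)), int(match.group(2))
    return feet * 12 + inches


print(stick_length_inches("9' 6\""))  # 114
```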

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 81539
28.5%
or 81539
28.5%
unspecified 81539
28.5%
9 10376
 
3.6%
6 10310
 
3.6%
10 8322
 
2.9%
8 3689
 
1.3%
11 1908
 
0.7%
2 1619
 
0.6%
0 1601
 
0.6%
Other values (11) 3619
 
1.3%

Most occurring characters

ValueCountFrequency (%)
e 244617
14.7%
183800
11.0%
n 163078
9.8%
o 163078
9.8%
i 163078
9.8%
N 81539
 
4.9%
c 81539
 
4.9%
f 81539
 
4.9%
d 81539
 
4.9%
p 81539
 
4.9%
Other values (15) 339376
20.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1223085
73.5%
Space Separator 183800
 
11.0%
Uppercase Letter 163078
 
9.8%
Decimal Number 53315
 
3.2%
Other Punctuation 41444
 
2.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 244617
20.0%
n 163078
13.3%
o 163078
13.3%
i 163078
13.3%
c 81539
 
6.7%
f 81539
 
6.7%
d 81539
 
6.7%
p 81539
 
6.7%
s 81539
 
6.7%
r 81539
 
6.7%
Decimal Number
ValueCountFrequency (%)
1 13784
25.9%
9 10381
19.5%
6 10310
19.3%
0 9923
18.6%
8 3689
 
6.9%
2 3133
 
5.9%
7 1437
 
2.7%
4 389
 
0.7%
5 191
 
0.4%
3 78
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
N 81539
50.0%
U 81539
50.0%
Other Punctuation
ValueCountFrequency (%)
' 20722
50.0%
" 20722
50.0%
Space Separator
ValueCountFrequency (%)
183800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1386163
83.3%
Common 278559
 
16.7%

Most frequent character per script

Common
ValueCountFrequency (%)
183800
66.0%
' 20722
 
7.4%
" 20722
 
7.4%
1 13784
 
4.9%
9 10381
 
3.7%
6 10310
 
3.7%
0 9923
 
3.6%
8 3689
 
1.3%
2 3133
 
1.1%
7 1437
 
0.5%
Other values (3) 658
 
0.2%
Latin
ValueCountFrequency (%)
e 244617
17.6%
n 163078
11.8%
o 163078
11.8%
i 163078
11.8%
N 81539
 
5.9%
c 81539
 
5.9%
f 81539
 
5.9%
d 81539
 
5.9%
p 81539
 
5.9%
s 81539
 
5.9%
Other values (2) 163078
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1664722
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 244617
14.7%
183800
11.0%
n 163078
9.8%
o 163078
9.8%
i 163078
9.8%
N 81539
 
4.9%
c 81539
 
4.9%
f 81539
 
4.9%
d 81539
 
4.9%
p 81539
 
4.9%
Other values (15) 339376
20.4%

Thumb
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 310366
Missing (%): 75.2%
Memory size: 3.1 MiB

None or Unspecified: 85074
Manual: 9678
Hydraulic: 7580

Length

Max length: 19
Median length: 19
Mean length: 17.029805
Min length: 6

Characters and Unicode

Total characters: 1742694
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None or Unspecified
2nd row: None or Unspecified
3rd row: None or Unspecified
4th row: None or Unspecified
5th row: None or Unspecified

Common Values

Value | Count | Frequency (%)
None or Unspecified | 85074 | 20.6%
Manual | 9678 | 2.3%
Hydraulic | 7580 | 1.8%
(Missing) | 310366 | 75.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 85074
31.2%
or 85074
31.2%
unspecified 85074
31.2%
manual 9678
 
3.6%
hydraulic 7580
 
2.8%

Most occurring characters

ValueCountFrequency (%)
e 255222
14.6%
n 179826
10.3%
i 177728
10.2%
o 170148
9.8%
170148
9.8%
c 92654
 
5.3%
r 92654
 
5.3%
d 92654
 
5.3%
f 85074
 
4.9%
N 85074
 
4.9%
Other values (9) 341512
19.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1385140
79.5%
Uppercase Letter 187406
 
10.8%
Space Separator 170148
 
9.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 255222
18.4%
n 179826
13.0%
i 177728
12.8%
o 170148
12.3%
c 92654
 
6.7%
r 92654
 
6.7%
d 92654
 
6.7%
f 85074
 
6.1%
p 85074
 
6.1%
s 85074
 
6.1%
Other values (4) 69032
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
N 85074
45.4%
U 85074
45.4%
M 9678
 
5.2%
H 7580
 
4.0%
Space Separator
ValueCountFrequency (%)
170148
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1572546
90.2%
Common 170148
 
9.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 255222
16.2%
n 179826
11.4%
i 177728
11.3%
o 170148
10.8%
c 92654
 
5.9%
r 92654
 
5.9%
d 92654
 
5.9%
f 85074
 
5.4%
N 85074
 
5.4%
p 85074
 
5.4%
Other values (8) 256438
16.3%
Common
ValueCountFrequency (%)
170148
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1742694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 255222
14.6%
n 179826
10.3%
i 177728
10.2%
o 170148
9.8%
170148
9.8%
c 92654
 
5.3%
r 92654
 
5.3%
d 92654
 
5.3%
f 85074
 
4.9%
N 85074
 
4.9%
Other values (9) 341512
19.6%

Pattern_Changer
Categorical

IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 310437
Missing (%): 75.2%
Memory size: 3.1 MiB

None or Unspecified: 92924
Yes: 9269
No: 68

Length

Max length: 19
Median length: 19
Mean length: 17.538446
Min length: 2

Characters and Unicode

Total characters: 1793499
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None or Unspecified
2nd row: None or Unspecified
3rd row: None or Unspecified
4th row: None or Unspecified
5th row: None or Unspecified

Common Values

Value | Count | Frequency (%)
None or Unspecified | 92924 | 22.5%
Yes | 9269 | 2.2%
No | 68 | < 0.1%
(Missing) | 310437 | 75.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 92924
32.3%
or 92924
32.3%
unspecified 92924
32.3%
yes 9269
 
3.2%
no 68
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 288041
16.1%
o 185916
10.4%
n 185848
10.4%
185848
10.4%
i 185848
10.4%
s 102193
 
5.7%
N 92992
 
5.2%
r 92924
 
5.2%
U 92924
 
5.2%
p 92924
 
5.2%
Other values (4) 288041
16.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1412466
78.8%
Uppercase Letter 195185
 
10.9%
Space Separator 185848
 
10.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 288041
20.4%
o 185916
13.2%
n 185848
13.2%
i 185848
13.2%
s 102193
 
7.2%
r 92924
 
6.6%
p 92924
 
6.6%
c 92924
 
6.6%
f 92924
 
6.6%
d 92924
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
N 92992
47.6%
U 92924
47.6%
Y 9269
 
4.7%
Space Separator
ValueCountFrequency (%)
185848
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1607651
89.6%
Common 185848
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 288041
17.9%
o 185916
11.6%
n 185848
11.6%
i 185848
11.6%
s 102193
 
6.4%
N 92992
 
5.8%
r 92924
 
5.8%
U 92924
 
5.8%
p 92924
 
5.8%
c 92924
 
5.8%
Other values (3) 195117
12.1%
Common
ValueCountFrequency (%)
185848
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1793499
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 288041
16.1%
o 185916
10.4%
n 185848
10.4%
185848
10.4%
i 185848
10.4%
s 102193
 
5.7%
N 92992
 
5.2%
r 92924
 
5.2%
U 92924
 
5.2%
p 92924
 
5.2%
Other values (4) 288041
16.1%

Grouser_Type
Categorical

IMBALANCE  MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 310505
Missing (%): 75.2%
Memory size: 3.1 MiB

Double: 86998
Triple: 15193
Single: 2

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 613158
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Double
2nd row: Double
3rd row: Double
4th row: Double
5th row: Double

Common Values

Value | Count | Frequency (%)
Double | 86998 | 21.1%
Triple | 15193 | 3.7%
Single | 2 | < 0.1%
(Missing) | 310505 | 75.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
double 86998
85.1%
triple 15193
 
14.9%
single 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
l 102193
16.7%
e 102193
16.7%
D 86998
14.2%
o 86998
14.2%
u 86998
14.2%
b 86998
14.2%
i 15195
 
2.5%
T 15193
 
2.5%
r 15193
 
2.5%
p 15193
 
2.5%
Other values (3) 6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 510965
83.3%
Uppercase Letter 102193
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 102193
20.0%
e 102193
20.0%
o 86998
17.0%
u 86998
17.0%
b 86998
17.0%
i 15195
 
3.0%
r 15193
 
3.0%
p 15193
 
3.0%
n 2
 
< 0.1%
g 2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
D 86998
85.1%
T 15193
 
14.9%
S 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 613158
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 102193
16.7%
e 102193
16.7%
D 86998
14.2%
o 86998
14.2%
u 86998
14.2%
b 86998
14.2%
i 15195
 
2.5%
T 15193
 
2.5%
r 15193
 
2.5%
p 15193
 
2.5%
Other values (3) 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 613158
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 102193
16.7%
e 102193
16.7%
D 86998
14.2%
o 86998
14.2%
u 86998
14.2%
b 86998
14.2%
i 15195
 
2.5%
T 15193
 
2.5%
r 15193
 
2.5%
p 15193
 
2.5%
Other values (3) 6
 
< 0.1%

Backhoe_Mounting
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 331986
Missing (%): 80.4%
Memory size: 3.1 MiB

None or Unspecified: 80692
Yes: 20

Length

Max length: 19
Median length: 19
Mean length: 18.996035
Min length: 3

Characters and Unicode

Total characters: 1533208
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None or Unspecified
2nd row: None or Unspecified
3rd row: None or Unspecified
4th row: None or Unspecified
5th row: None or Unspecified

Common Values

Value | Count | Frequency (%)
None or Unspecified | 80692 | 19.6%
Yes | 20 | < 0.1%
(Missing) | 331986 | 80.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 80692
33.3%
or 80692
33.3%
unspecified 80692
33.3%
yes 20
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 242096
15.8%
o 161384
10.5%
n 161384
10.5%
161384
10.5%
i 161384
10.5%
s 80712
 
5.3%
N 80692
 
5.3%
r 80692
 
5.3%
U 80692
 
5.3%
p 80692
 
5.3%
Other values (4) 242096
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1210420
78.9%
Uppercase Letter 161404
 
10.5%
Space Separator 161384
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 242096
20.0%
o 161384
13.3%
n 161384
13.3%
i 161384
13.3%
s 80712
 
6.7%
r 80692
 
6.7%
p 80692
 
6.7%
c 80692
 
6.7%
f 80692
 
6.7%
d 80692
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
N 80692
50.0%
U 80692
50.0%
Y 20
 
< 0.1%
Space Separator
ValueCountFrequency (%)
161384
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1371824 | 89.5%
Common | 161384 | 10.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 242096 | 17.6%
o | 161384 | 11.8%
n | 161384 | 11.8%
i | 161384 | 11.8%
s | 80712 | 5.9%
N | 80692 | 5.9%
r | 80692 | 5.9%
U | 80692 | 5.9%
p | 80692 | 5.9%
c | 80692 | 5.9%
Other values (3) | 161404 | 11.8%

Common
Value | Count | Frequency (%)
(space) | 161384 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1533208 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 242096 | 15.8%
o | 161384 | 10.5%
n | 161384 | 10.5%
(space) | 161384 | 10.5%
i | 161384 | 10.5%
s | 80712 | 5.3%
N | 80692 | 5.3%
r | 80692 | 5.3%
U | 80692 | 5.3%
p | 80692 | 5.3%
Other values (4) | 242096 | 15.8%

Blade_Type
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 330823
Missing (%): 80.2%
Memory size: 3.1 MiB

Value | Count
PAT | 39633
Straight | 13461
None or Unspecified | 11841
Semi U | 8907
VPAT | 3681
Other values (5) | 4352

Length

Max length: 19
Median length: 8
Mean length: 6.4949985
Min length: 1

Characters and Unicode

Total characters: 531778
Distinct characters: 26
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PAT
2nd row: None or Unspecified
3rd row: None or Unspecified
4th row: None or Unspecified
5th row: None or Unspecified

Common Values

Value | Count | Frequency (%)
PAT | 39633 | 9.6%
Straight | 13461 | 3.3%
None or Unspecified | 11841 | 2.9%
Semi U | 8907 | 2.2%
VPAT | 3681 | 0.9%
U | 1888 | 0.5%
Angle | 1684 | 0.4%
No | 743 | 0.2%
Landfill | 26 | < 0.1%
Coal | 11 | < 0.1%
(Missing) | 330823 | 80.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
pat | 39633 | 34.6%
straight | 13461 | 11.8%
none | 11841 | 10.3%
or | 11841 | 10.3%
unspecified | 11841 | 10.3%
u | 10795 | 9.4%
semi | 8907 | 7.8%
vpat | 3681 | 3.2%
angle | 1684 | 1.5%
no | 743 | 0.6%
Other values (2) | 37 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
e | 46114 | 8.7%
i | 46076 | 8.7%
A | 44998 | 8.5%
P | 43314 | 8.1%
T | 43314 | 8.1%
(space) | 32589 | 6.1%
t | 26922 | 5.1%
n | 25392 | 4.8%
r | 25302 | 4.8%
o | 24436 | 4.6%
Other values (16) | 173321 | 32.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 306257 | 57.6%
Uppercase Letter | 192932 | 36.3%
Space Separator | 32589 | 6.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 46114 | 15.1%
i | 46076 | 15.0%
t | 26922 | 8.8%
n | 25392 | 8.3%
r | 25302 | 8.3%
o | 24436 | 8.0%
g | 15145 | 4.9%
a | 13498 | 4.4%
h | 13461 | 4.4%
f | 11867 | 3.9%
Other values (6) | 58044 | 19.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 44998 | 23.3%
P | 43314 | 22.5%
T | 43314 | 22.5%
U | 22636 | 11.7%
S | 22368 | 11.6%
N | 12584 | 6.5%
V | 3681 | 1.9%
L | 26 | < 0.1%
C | 11 | < 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 32589 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 499189 | 93.9%
Common | 32589 | 6.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 46114 | 9.2%
i | 46076 | 9.2%
A | 44998 | 9.0%
P | 43314 | 8.7%
T | 43314 | 8.7%
t | 26922 | 5.4%
n | 25392 | 5.1%
r | 25302 | 5.1%
o | 24436 | 4.9%
U | 22636 | 4.5%
Other values (15) | 150685 | 30.2%

Common
Value | Count | Frequency (%)
(space) | 32589 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 531778 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 46114 | 8.7%
i | 46076 | 8.7%
A | 44998 | 8.5%
P | 43314 | 8.1%
T | 43314 | 8.1%
(space) | 32589 | 6.1%
t | 26922 | 5.1%
n | 25392 | 4.8%
r | 25302 | 4.8%
o | 24436 | 4.6%
Other values (16) | 173321 | 32.6%

Travel_Controls
Categorical

IMBALANCE  MISSING 

Distinct: 7
Distinct (%): < 0.1%
Missing: 330821
Missing (%): 80.2%
Memory size: 3.1 MiB

Value | Count
None or Unspecified | 71447
Differential Steer | 5257
Finger Tip | 2693
2 Pedal | 1144
Lever | 902
Other values (2) | 434

Length

Max length: 19
Median length: 19
Mean length: 18.243939
Min length: 5

Characters and Unicode

Total characters: 1493759
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None or Unspecified
2nd row: None or Unspecified
3rd row: None or Unspecified
4th row: None or Unspecified
5th row: None or Unspecified

Common Values

Value | Count | Frequency (%)
None or Unspecified | 71447 | 17.3%
Differential Steer | 5257 | 1.3%
Finger Tip | 2693 | 0.7%
2 Pedal | 1144 | 0.3%
Lever | 902 | 0.2%
Pedal | 423 | 0.1%
1 Speed | 11 | < 0.1%
(Missing) | 330821 | 80.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
none | 71447 | 30.5%
or | 71447 | 30.5%
unspecified | 71447 | 30.5%
differential | 5257 | 2.2%
steer | 5257 | 2.2%
finger | 2693 | 1.2%
tip | 2693 | 1.2%
pedal | 1567 | 0.7%
2 | 1144 | 0.5%
lever | 902 | 0.4%
Other values (2) | 22 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
e | 241455 | 16.2%
i | 158794 | 10.6%
(space) | 151999 | 10.2%
n | 150844 | 10.1%
o | 142894 | 9.6%
r | 85556 | 5.7%
f | 81961 | 5.5%
p | 74151 | 5.0%
d | 73025 | 4.9%
N | 71447 | 4.8%
Other values (16) | 261633 | 17.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1179331 | 79.0%
Uppercase Letter | 161274 | 10.8%
Space Separator | 151999 | 10.2%
Decimal Number | 1155 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 241455 | 20.5%
i | 158794 | 13.5%
n | 150844 | 12.8%
o | 142894 | 12.1%
r | 85556 | 7.3%
f | 81961 | 6.9%
p | 74151 | 6.3%
d | 73025 | 6.2%
c | 71447 | 6.1%
s | 71447 | 6.1%
Other values (5) | 27757 | 2.4%

Uppercase Letter
Value | Count | Frequency (%)
N | 71447 | 44.3%
U | 71447 | 44.3%
S | 5268 | 3.3%
D | 5257 | 3.3%
F | 2693 | 1.7%
T | 2693 | 1.7%
P | 1567 | 1.0%
L | 902 | 0.6%

Decimal Number
Value | Count | Frequency (%)
2 | 1144 | 99.0%
1 | 11 | 1.0%

Space Separator
Value | Count | Frequency (%)
(space) | 151999 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1340605 | 89.7%
Common | 153154 | 10.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 241455 | 18.0%
i | 158794 | 11.8%
n | 150844 | 11.3%
o | 142894 | 10.7%
r | 85556 | 6.4%
f | 81961 | 6.1%
p | 74151 | 5.5%
d | 73025 | 5.4%
N | 71447 | 5.3%
c | 71447 | 5.3%
Other values (13) | 189031 | 14.1%

Common
Value | Count | Frequency (%)
(space) | 151999 | 99.2%
2 | 1144 | 0.7%
1 | 11 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1493759 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 241455 | 16.2%
i | 158794 | 10.6%
(space) | 151999 | 10.2%
n | 150844 | 10.1%
o | 142894 | 9.6%
r | 85556 | 5.7%
f | 81961 | 5.5%
p | 74151 | 5.0%
d | 73025 | 4.9%
N | 71447 | 4.8%
Other values (16) | 261633 | 17.5%

Differential_Type
Categorical

IMBALANCE  MISSING 

Distinct: 4
Distinct (%): < 0.1%
Missing: 341134
Missing (%): 82.7%
Memory size: 3.1 MiB

Value | Count
Standard | 70169
Limited Slip | 1181
No Spin | 212
Locking | 2

Length

Max length: 12
Median length: 8
Mean length: 8.0630205
Min length: 7

Characters and Unicode

Total characters: 577022
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Standard
2nd row: Standard
3rd row: Standard
4th row: Standard
5th row: Standard

Common Values

Value | Count | Frequency (%)
Standard | 70169 | 17.0%
Limited Slip | 1181 | 0.3%
No Spin | 212 | 0.1%
Locking | 2 | < 0.1%
(Missing) | 341134 | 82.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
standard | 70169 | 96.2%
limited | 1181 | 1.6%
slip | 1181 | 1.6%
no | 212 | 0.3%
spin | 212 | 0.3%
locking | 2 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
d | 141519 | 24.5%
a | 140338 | 24.3%
S | 71562 | 12.4%
t | 71350 | 12.4%
n | 70383 | 12.2%
r | 70169 | 12.2%
i | 3757 | 0.7%
p | 1393 | 0.2%
(space) | 1393 | 0.2%
L | 1183 | 0.2%
Other values (8) | 3975 | 0.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 502672 | 87.1%
Uppercase Letter | 72957 | 12.6%
Space Separator | 1393 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
d | 141519 | 28.2%
a | 140338 | 27.9%
t | 71350 | 14.2%
n | 70383 | 14.0%
r | 70169 | 14.0%
i | 3757 | 0.7%
p | 1393 | 0.3%
l | 1181 | 0.2%
e | 1181 | 0.2%
m | 1181 | 0.2%
Other values (4) | 220 | < 0.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 71562 | 98.1%
L | 1183 | 1.6%
N | 212 | 0.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1393 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 575629 | 99.8%
Common | 1393 | 0.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
d | 141519 | 24.6%
a | 140338 | 24.4%
S | 71562 | 12.4%
t | 71350 | 12.4%
n | 70383 | 12.2%
r | 70169 | 12.2%
i | 3757 | 0.7%
p | 1393 | 0.2%
L | 1183 | 0.2%
l | 1181 | 0.2%
Other values (7) | 2794 | 0.5%

Common
Value | Count | Frequency (%)
(space) | 1393 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 577022 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
d | 141519 | 24.5%
a | 140338 | 24.3%
S | 71562 | 12.4%
t | 71350 | 12.4%
n | 70383 | 12.2%
r | 70169 | 12.2%
i | 3757 | 0.7%
p | 1393 | 0.2%
(space) | 1393 | 0.2%
L | 1183 | 0.2%
Other values (8) | 3975 | 0.7%

Steering_Controls
Categorical

IMBALANCE  MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 341176
Missing (%): 82.7%
Memory size: 3.1 MiB

Value | Count
Conventional | 70774
Command Control | 594
Four Wheel Standard | 139
Wheel | 14
No | 1

Length

Max length: 19
Median length: 12
Mean length: 12.03701
Min length: 2

Characters and Unicode

Total characters: 860911
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Conventional
2nd row: Conventional
3rd row: Conventional
4th row: Conventional
5th row: Conventional

Common Values

Value | Count | Frequency (%)
Conventional | 70774 | 17.1%
Command Control | 594 | 0.1%
Four Wheel Standard | 139 | < 0.1%
Wheel | 14 | < 0.1%
No | 1 | < 0.1%
(Missing) | 341176 | 82.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
conventional | 70774 | 97.8%
command | 594 | 0.8%
control | 594 | 0.8%
wheel | 153 | 0.2%
four | 139 | 0.2%
standard | 139 | 0.2%
no | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
n | 213649 | 24.8%
o | 143470 | 16.7%
C | 71962 | 8.4%
a | 71646 | 8.3%
l | 71521 | 8.3%
t | 71507 | 8.3%
e | 71080 | 8.3%
v | 70774 | 8.2%
i | 70774 | 8.2%
m | 1188 | 0.1%
Other values (9) | 3340 | 0.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 787645 | 91.5%
Uppercase Letter | 72394 | 8.4%
Space Separator | 872 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 213649 | 27.1%
o | 143470 | 18.2%
a | 71646 | 9.1%
l | 71521 | 9.1%
t | 71507 | 9.1%
e | 71080 | 9.0%
v | 70774 | 9.0%
i | 70774 | 9.0%
m | 1188 | 0.2%
d | 872 | 0.1%
Other values (3) | 1164 | 0.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 71962 | 99.4%
W | 153 | 0.2%
F | 139 | 0.2%
S | 139 | 0.2%
N | 1 | < 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 872 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 860039 | 99.9%
Common | 872 | 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 213649 | 24.8%
o | 143470 | 16.7%
C | 71962 | 8.4%
a | 71646 | 8.3%
l | 71521 | 8.3%
t | 71507 | 8.3%
e | 71080 | 8.3%
v | 70774 | 8.2%
i | 70774 | 8.2%
m | 1188 | 0.1%
Other values (8) | 2468 | 0.3%

Common
Value | Count | Frequency (%)
(space) | 872 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 860911 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 213649 | 24.8%
o | 143470 | 16.7%
C | 71962 | 8.4%
a | 71646 | 8.3%
l | 71521 | 8.3%
t | 71507 | 8.3%
e | 71080 | 8.3%
v | 70774 | 8.2%
i | 70774 | 8.2%
m | 1188 | 0.1%
Other values (9) | 3340 | 0.4%

Interactions

[Figures: 64 interaction plots (Matplotlib SVG); images not reproduced in this text version.]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
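
Nullity correlation can be sketched as the Pearson correlation between per-column missingness indicators. The example below is a minimal sketch on a small made-up frame: the rows are invented for illustration and are not taken from the dataset, only the column names are. `nullity_correlation` is a hypothetical helper, and the report may compute its heatmap differently.

```python
import pandas as pd


def nullity_correlation(df):
    """Correlate missingness indicators (1 = missing, 0 = present).

    Columns that are never missing or always missing are dropped,
    because a constant indicator has no defined correlation.
    """
    nullity = df.isnull()
    varying = nullity.loc[:, nullity.any() & ~nullity.all()]
    return varying.astype(int).corr()


# Toy rows, invented for illustration only.
toy = pd.DataFrame({
    "Differential_Type": ["Standard", "Standard", None, None, None, "Standard"],
    "Steering_Controls": ["Conventional", "Conventional", None, None, None, "Conventional"],
    "Blade_Type": [None, None, "PAT", "PAT", None, None],
})

corr = nullity_correlation(toy)
print(corr.round(2))
```

In this toy frame Differential_Type and Steering_Controls are always missing together, so their nullity correlation is 1, while Blade_Type tends to be present when the other two are missing, which gives a negative value.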

Sample

SalesIDSalePriceMachineIDModelIDdatasourceauctioneerIDYearMadeMachineHoursCurrentMeterUsageBandsaledatefiModelDescfiBaseModelfiSecondaryDescfiModelSeriesfiModelDescriptorProductSizefiProductClassDescstateProductGroupProductGroupDescDrive_SystemEnclosureForksPad_TypeRide_ControlStickTransmissionTurbochargedBlade_ExtensionBlade_WidthEnclosure_TypeEngine_HorsepowerHydraulicsPushblockRipperScarifierTip_ControlTire_SizeCouplerCoupler_SystemGrouser_TracksHydraulics_FlowTrack_TypeUndercarriage_Pad_WidthStick_LengthThumbPattern_ChangerGrouser_TypeBackhoe_MountingBlade_TypeTravel_ControlsDifferential_TypeSteering_Controls
0113924666000.099908931571213.0200468.0Low11/16/2006 0:00521D521DNaNNaNNaNWheel Loader - 110.0 to 120.0 HorsepowerAlabamaWLWheel LoaderNaNEROPS w ACNone or UnspecifiedNaNNone or UnspecifiedNaNNaNNaNNaNNaNNaNNaN2 ValveNaNNaNNaNNaNNone or UnspecifiedNone or UnspecifiedNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNStandardConventional
1113924857000.0117657771213.019964640.0Low3/26/2004 0:00950FII950FIINaNMediumWheel Loader - 150.0 to 175.0 HorsepowerNorth CarolinaWLWheel LoaderNaNEROPS w ACNone or UnspecifiedNaNNone or UnspecifiedNaNNaNNaNNaNNaNNaNNaN2 ValveNaNNaNNaNNaN23.5None or UnspecifiedNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNStandardConventional
2113924910000.043480870091213.020012838.0High2/26/2004 0:00226226NaNNaNNaNNaNSkid Steer Loader - 1351.0 to 1601.0 Lb Operating CapacityNew YorkSSLSkid Steer LoadersNaNOROPSNone or UnspecifiedNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedStandardNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3113925138500.010264703321213.020013486.0High5/19/2011 0:00PC120-6EPC120NaN-6ENaNSmallHydraulic Excavator, Track - 12.0 to 14.0 Metric TonsTexasTEXTrack ExcavatorsNaNEROPS w ACNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2 ValveNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
4113925311000.01057373173111213.02007722.0Medium7/23/2009 0:00S175S175NaNNaNNaNNaNSkid Steer Loader - 1601.0 to 1751.0 Lb Operating CapacityNew YorkSSLSkid Steer LoadersNaNEROPSNone or UnspecifiedNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedStandardNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
5113925526500.0100127446051213.02004508.0Low12/18/2008 0:00310G310GNaNNaNNaNBackhoe Loader - 14.0 to 15.0 Ft Standard Digging DepthArizonaBLBackhoe LoadersFour Wheel DriveOROPSNone or UnspecifiedNone or UnspecifiedNoExtendedPowershuttleNone or UnspecifiedNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
6113925621000.077270119371213.0199311540.0High8/26/2004 0:00790ELC790ENaNLCLarge / MediumHydraulic Excavator, Track - 21.0 to 24.0 Metric TonsFloridaTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNStandardNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNSteelNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
7113926127000.090200235391213.020014883.0High11/17/2005 0:00416D416DNaNNaNNaNBackhoe Loader - 14.0 to 15.0 Ft Standard Digging DepthIllinoisBLBackhoe LoadersFour Wheel DriveOROPSNone or UnspecifiedReversibleNoStandardStandardYesNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
8113927221500.01036251360031213.02008302.0Low8/27/2009 0:00430HAG430HAGNaNNaNMiniHydraulic Excavator, Track - 3.0 to 4.0 Metric TonsTexasTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNManualNaNNaNNaNRubberNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
9113927565000.0101647438831213.0100020700.0Medium8/9/2007 0:00988B988BNaNNaNLargeWheel Loader - 350.0 to 500.0 HorsepowerFloridaWLWheel LoaderNaNEROPS w ACNone or UnspecifiedNaNNone or UnspecifiedNaNNaNNaNNaNNaNNaNNaN2 ValveNaNNaNNaNNaNNone or UnspecifiedNone or UnspecifiedNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNStandardConventional
SalesIDSalePriceMachineIDModelIDdatasourceauctioneerIDYearMadeMachineHoursCurrentMeterUsageBandsaledatefiModelDescfiBaseModelfiSecondaryDescfiModelSeriesfiModelDescriptorProductSizefiProductClassDescstateProductGroupProductGroupDescDrive_SystemEnclosureForksPad_TypeRide_ControlStickTransmissionTurbochargedBlade_ExtensionBlade_WidthEnclosure_TypeEngine_HorsepowerHydraulicsPushblockRipperScarifierTip_ControlTire_SizeCouplerCoupler_SystemGrouser_TracksHydraulics_FlowTrack_TypeUndercarriage_Pad_WidthStick_LengthThumbPattern_ChangerGrouser_TypeBackhoe_MountingBlade_TypeTravel_ControlsDifferential_TypeSteering_Controls
412688633330511500.01800259214371491.02006NaNNaN2/13/2012 0:0035N35NNaNNaNMiniHydraulic Excavator, Track - 3.0 to 4.0 Metric TonsFloridaTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNStandardNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNSteelNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
412689633331413000.01908162214371492.02006NaNNaN1/28/2012 0:0035N35NNaNNaNMiniHydraulic Excavator, Track - 3.0 to 4.0 Metric TonsFloridaTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNRubberNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
412690633333020500.01879923214461492.02006NaNNaN1/28/2012 0:0055N255N2NaNMiniHydraulic Excavator, Track - 5.0 to 6.0 Metric TonsFloridaTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNRubberNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
412691633333913000.01856845214351492.02005NaNNaN1/28/2012 0:0030NX30NXNaNNaNMiniHydraulic Excavator, Track - 2.0 to 3.0 Metric TonsFloridaTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNRubberNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
412692633334310000.01799614214351491.02005NaNNaN2/13/2012 0:0030NX30NXNaNNaNMiniHydraulic Excavator, Track - 2.0 to 3.0 Metric TonsFloridaTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNStandardNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNSteelNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
412693633334410000.01919201214351492.02005NaNNaN3/7/2012 0:0030NX30NXNaNNaNMiniHydraulic Excavator, Track - 2.0 to 3.0 Metric TonsTexasTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNStandardNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNSteelNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
412694633334510500.01882122214361492.02005NaNNaN1/28/2012 0:0030NX230NX2NaNMiniHydraulic Excavator, Track - 3.0 to 4.0 Metric TonsFloridaTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNSteelNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
412695633334712500.01944213214351492.02005NaNNaN1/28/2012 0:0030NX30NXNaNNaNMiniHydraulic Excavator, Track - 2.0 to 3.0 Metric TonsFloridaTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNRubberNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
412696633334810000.01794518214351492.02006NaNNaN3/7/2012 0:0030NX30NXNaNNaNMiniHydraulic Excavator, Track - 2.0 to 3.0 Metric TonsTexasTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNRubberNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN
412697633334913000.01944743214361492.02006NaNNaN1/28/2012 0:0030NX230NX2NaNMiniHydraulic Excavator, Track - 3.0 to 4.0 Metric TonsFloridaTEXTrack ExcavatorsNaNEROPSNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNAuxiliaryNaNNaNNaNNaNNaNNone or UnspecifiedNaNNaNNaNRubberNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedNone or UnspecifiedDoubleNaNNaNNaNNaNNaN